Search Results: "abi"

12 February 2024

Freexian Collaborators: Monthly report about Debian Long Term Support, January 2024 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors

In January, 16 contributors have been paid to work on Debian LTS, their reports are available:

Abhijith PA did 14.0h (out of 7.0h assigned and 7.0h from previous period).

Bastien Roucariès did 22.0h (out of 16.0h assigned and 6.0h from previous period).

Ben Hutchings did 14.5h (out of 8.0h assigned and 16.0h from previous period), thus carrying over 9.5h to the next month.

Chris Lamb did 18.0h (out of 18.0h assigned).

Daniel Leidert did 10.0h (out of 10.0h assigned).

Emilio Pozuelo Monfort did 10.0h (out of 14.75h assigned and 27.0h from previous period), thus carrying over 31.75h to the next month.

Guilhem Moulin did 9.75h (out of 25.0h assigned), thus carrying over 15.25h to the next month.

Holger Levsen did 3.5h (out of 12.0h assigned), thus carrying over 8.5h to the next month.

Markus Koschany did 40.0h (out of 40.0h assigned).

Roberto C. Sánchez did 8.75h (out of 9.5h assigned and 2.5h from previous period), thus carrying over 3.25h to the next month.

Santiago Ruano Rincón did 13.5h (out of 8.25h assigned and 7.75h from previous period), thus carrying over 2.5h to the next month.

Sean Whitton did 0.5h (out of 0.25h assigned and 5.75h from previous period), thus carrying over 5.5h to the next month.

Sylvain Beucler did 9.5h (out of 23.25h assigned and 18.5h from previous period), thus carrying over 32.25h to the next month.

Thorsten Alteholz did 14.0h (out of 14.0h assigned).

Tobias Frost did 12.0h (out of 10.25h assigned and 1.75h from previous period).

Utkarsh Gupta did 8.5h (out of 35.75h assigned), thus carrying over 24.75h to the next month.

Evolution of the situation

In January, we released 25 DLAs. A variety of particularly notable packages were updated during January. Among those updates were the Linux kernel (both versions 5.10 and 4.19), mariadb-10.3, openjdk-11, firefox-esr, and thunderbird. In addition to the many other LTS package updates which were released in January, LTS contributors continue their efforts to make impactful contributions within the Debian community.

Thanks to our sponsors

Sponsors that joined recently are in bold.

Platinum sponsors:

TOSHIBA (for 101 months)

Civil Infrastructure Platform (CIP) (for 69 months)

Gold sponsors:

Roche Diagnostics International AG (for 112 months)

Linode (for 106 months)

Babiel GmbH (for 95 months)

Plat'Home (for 95 months)

CINECA (for 69 months)

University of Oxford (for 51 months)

Deveryware (for 38 months)

VyOS Inc (for 33 months)

EDF SA (for 22 months)

Silver sponsors:

Domeneshop AS (for 116 months)

Nantes Métropole (for 110 months)

Univention GmbH (for 102 months)

Université Jean Monnet de St Etienne (for 102 months)

Ribbon Communications, Inc. (for 96 months)

Exonet B.V. (for 86 months)

Leibniz Rechenzentrum (for 80 months)

Ministère de l'Europe et des Affaires Étrangères (for 63 months)

Cloudways by DigitalOcean (for 53 months)

Dinahosting SL (for 51 months)

Bauer Xcel Media Deutschland KG (for 45 months)

Platform.sh SAS (for 45 months)

Moxa Inc. (for 39 months)

sipgate GmbH (for 36 months)

OVH US LLC (for 34 months)

Tilburg University (for 34 months)

GSI Helmholtzzentrum für Schwerionenforschung GmbH (for 26 months)

Soliton Systems K.K. (for 23 months)

Bronze sponsors:

Evolix (for 117 months)

Seznam.cz, a.s. (for 117 months)

Intevation GmbH (for 114 months)

Linuxhotel GmbH (for 114 months)

Daevel SARL (for 112 months)

Bitfolk LTD (for 111 months)

Megaspace Internet Services GmbH (for 111 months)

Greenbone AG (for 110 months)

NUMLOG (for 110 months)

WinGo AG (for 110 months)

Ecole Centrale de Nantes - LHEEA (for 106 months)

Entr'ouvert (for 101 months)

Adfinis AG (for 98 months)

GNI MEDIA (for 93 months)

Laboratoire LEGI - UMR 5519 / CNRS (for 93 months)

Tesorion (for 93 months)

Bearstech (for 84 months)

LiHAS (for 84 months)

Catalyst IT Ltd (for 79 months)

Supagro (for 74 months)

Demarcq SAS (for 73 months)

Université Grenoble Alpes (for 59 months)

TouchWeb SAS (for 51 months)

SPiN AG (for 48 months)

CoreFiling (for 43 months)

Institut des sciences cognitives Marc Jeannerod (for 38 months)

Observatoire des Sciences de l'Univers de Grenoble (for 35 months)

Tem Innovations GmbH (for 30 months)

WordFinder.pro (for 29 months)

CNRS DT INSU Résif (for 28 months)

Alter Way (for 21 months)

Institut Camille Jordan (for 10 months)

9 February 2024

Reproducible Builds (diffoscope): diffoscope 256 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 256. This version includes the following changes:

* CVE-2024-25711: Use a deterministic name when extracting content from GPG
  artifacts instead of trusting the value of gpg's --use-embedded-filenames.
  This prevents a potential information disclosure vulnerability that could
  have been exploited by providing a specially-crafted GPG file with an
  embedded filename of, say, "../../.ssh/id_rsa".
  Many thanks to Daniel Kahn Gillmor <dkg@debian.org> for reporting this
  issue and providing feedback.
  (Closes: reproducible-builds/diffoscope#361)
* Temporarily fix support for Python 3.11.8 re. a potential regression
  with the handling of ZIP files. (See reproducible-builds/diffoscope#362)
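
The class of problem behind the CVE entry above can be illustrated with a small sketch. This is not diffoscope's code: the function names and the fixed name `extracted` are hypothetical, and the example only contrasts trusting an embedded filename with always writing to a fixed, predictable name inside the extraction directory.

```python
import posixpath


def naive_target(dest_dir, embedded_name):
    """Unsafe: builds the output path from a name chosen by the file's author."""
    return posixpath.normpath(posixpath.join(dest_dir, embedded_name))


def deterministic_target(dest_dir, embedded_name):
    """Safer: ignore the embedded name and always use one fixed name.

    The name "extracted" is a placeholder chosen for this sketch.
    """
    return posixpath.join(dest_dir, "extracted")


def is_inside(dest_dir, path):
    """Return True if `path` stays within `dest_dir` after normalisation."""
    dest = posixpath.normpath(dest_dir)
    return posixpath.commonpath([dest, posixpath.normpath(path)]) == dest


if __name__ == "__main__":
    work = "/tmp/work/gpg"
    hostile = "../../.ssh/id_rsa"
    print(naive_target(work, hostile), is_inside(work, naive_target(work, hostile)))
    print(deterministic_target(work, hostile),
          is_inside(work, deterministic_target(work, hostile)))
```

With the hostile name from the changelog entry, the naive path resolves outside the working directory, while the fixed name cannot.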

You can find out more by visiting the project homepage.

7 February 2024

Reproducible Builds: Reproducible Builds in January 2024

Welcome to the January 2024 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.

How we executed a critical supply chain attack on PyTorch

John Stawinski and Adnan Khan published a lengthy blog post detailing how they executed a supply-chain attack against PyTorch, a popular machine learning platform "used by titans like Google, Meta, Boeing, and Lockheed Martin":

"Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies - the list goes on. In short, it was bad. Quite bad."

The attack pivoted on PyTorch's use of self-hosted runners as well as submitting a pull request to address a trivial typo in the project's `README` file to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.

New Arch Linux forensic filesystem tool

On our mailing list this month, long-time Reproducible Builds developer kpcyrd announced a new tool designed to forensically analyse Arch Linux filesystem images. Called `archlinux-userland-fs-cmp`, the tool is "supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], `/mnt`." Crucially, however, at no point is any file from the mounted filesystem eval'd or otherwise executed. Parsers are written in a memory safe language. More information about the tool can be found on their announcement message, as well as on the tool's homepage. A GIF of the tool in action is also available.

Issues with our `SOURCE_DATE_EPOCH` code?

Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the `SOURCE_DATE_EPOCH` environment variable:

"I'm not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here."

Chris ends his message with a request that those with intimate or low-level knowledge of `time_t`, C types, overflows and the various parsing libraries in the C standard library (etc.) contribute with further info.
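
For readers unfamiliar with the variable, here is a minimal Python sketch of the general idea of reading `SOURCE_DATE_EPOCH` strictly. It is not the C snippet under discussion, and the function name `build_timestamp` is hypothetical. It assumes the value should be a plain ASCII decimal integer and rejects anything else. Because Python integers have arbitrary precision, the `time_t` and overflow questions raised in the thread do not arise here.

```python
import os
import time


def build_timestamp(environ=None):
    """Return the timestamp a build should embed.

    Uses SOURCE_DATE_EPOCH when set, otherwise the current time.
    Raises ValueError for values that are not plain decimal digits.
    """
    env = os.environ if environ is None else environ
    raw = env.get("SOURCE_DATE_EPOCH")
    if raw is None:
        # Variable not set: fall back to "now" (a non-reproducible build).
        return int(time.time())
    # Reject empty strings, signs, whitespace and non-ASCII digits.
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"SOURCE_DATE_EPOCH is not a decimal integer: {raw!r}")
    return int(raw)


if __name__ == "__main__":
    print(build_timestamp({"SOURCE_DATE_EPOCH": "1704067200"}))
```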

Distribution updates

In Debian this month, Roland Clobus posted another detailed update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that "all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours)". Additionally, 7 of the 8 bookworm images from the official download link build reproducibly at any later time. In addition to this, three reviews of Debian packages were added, 17 were updated and 15 were removed this month, adding to our knowledge about identified issues. Elsewhere, Bernhard posted another monthly update for his work in openSUSE.

Community updates

A number of improvements were made to our website, including Bernhard M. Wiedemann fixing a number of typos of the term "nondeterministic" and Jan Zerebecki adding a substantial and highly welcome section to our page about `SOURCE_DATE_EPOCH` to document its interaction with distribution rebuilds.

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions `254` and `255` to Debian but focusing on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson, as well as considerable work from Vekhir in order to fix compatibility between various and subtly incompatible versions of the progressbar libraries in Python. Thanks!

Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen:

Debian-related changes:

Reduce the number of `arm64` architecture workers from 24 to 16.

Use diffoscope from the Debian release being tested again.

Improve the handling when killing unwanted processes and be more verbose about it, too.

Don't mark a job as failed if a process marked as to-be-killed is already gone.

Display the architecture of builds that have been running for more than 48 hours.

Reboot `arm64` nodes when they hit an OOM (out of memory) state.

Package rescheduling changes:

Reduce IRC notifications to 1 when rescheduling due to package status changes.

Correctly set `SUDO_USER` when rescheduling packages.

Automatically reschedule packages regressing to FTBFS (build failure) or FTBR (build success, but unreproducible).

OpenWrt-related changes:

Install the `python3-dev` and `python3-pyelftools` packages as they are now needed for the `sunxi` target.

Also install the `libpam0g-dev` package, which is needed by some OpenWrt hardware targets.

Misc:

As it's January, set the `real_year` variable to 2024 and bump various copyright years as well.

Fix a large (!) number of spelling mistakes in various scripts.

Prevent Squid and Systemd processes from being killed by the kernel's OOM killer.

Install the `iptables` tool everywhere, else our custom `rc.local` script fails.

Cleanup the `/srv/workspace/pbuilder` directory on boot.

Automatically restart Squid if it fails.

Limit the execution of `chroot-installation` jobs to a maximum of 4 concurrent runs.

Significant amounts of node maintenance were performed by Holger Levsen and Vagrant Cascadian. Indeed, Vagrant Cascadian handled an extended power outage for the network running the Debian `armhf` architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. Elsewhere in our infrastructure, however, Holger Levsen also adjusted the email configuration for `@reproducible-builds.org` to deal with a new SMTP email attack.

Upstream patches

The Reproducible Builds project tries to detect, dissect and fix as many (currently) unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Bernhard M. Wiedemann:

`cython` (nondeterministic path issue)

`deluge` (issue with modification time of `.egg` file)

`gap-ferret`, `gap-semigroups` & `gap-simpcomp` (nondeterministic `config.log` file)

`grpc` (filesystem ordering issue)

`hub` (random)

`kubernetes1.22` & `kubernetes1.23` (sort-related issue)

`kubernetes1.24` & `kubernetes1.25` (`go -trimpath` vs random issue)

`libjcat` (drop test files with random bytes)

`luajit` (Use new `d` option for deterministic bytecode output)

`meson` (sort the results from Python filesystem call)

`python-rjsmin` (drop GCC instrumentation artifacts)

`qt6-virtualkeyboard+others` (bug parallelism/race)

`SoapySDR` (parallelism-related issue)

`systemd` (sorting problem)

`warewulf` (CPIO modification time issue, etc.)

Chris Lamb:

#1060254 filed against `mumble`.

James Addison:

`guake` ("Schroedinger file" due to race condition)

`qhelpgenerator-qt5` (timezone localization; fix also merged upstream for QT6)

`sphinx` (search index `doctitle` sorting)

Separate to this, Vagrant Cascadian followed up with the relevant maintainers when reproducibility fixes were not included in newly-uploaded versions of the `mm-common` package in Debian; this was quickly fixed, however.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Mailing list: `rb-general@lists.reproducible-builds.org`

Mastodon: @reproducible_builds

Twitter: @ReproBuilds

6 February 2024

Louis-Philippe Véronneau: Montreal's Debian & Stuff - February 2024

New Year, Same Great People!

Our Debian User Group met for the first of our 2024 bi-monthly meetings on February 4th and it was loads of fun. Around twelve different people made it this time to Koumbit, where the meeting happened. As a reminder, our meetings are called "Debian & Stuff" because we want to be as open as possible and welcome people that want to work on "other stuff" than Debian. Here is what we did:

pollo:

tested a laptop that had a defective battery with a known good one (the problem was indeed with the battery :D)
renewed his expiring OpenPGP key
worked on removing trapperkeeper-scheduler-clojure from the DebCI reject_list
helped anarcat with packaging sigal
managed to feed most of the people present :)

LeLutin:

worked with lavamind to upload new upstream release of smokeping

mjeanson:

migrated from lxd to incus on his servers
helped anarcat with flashing his AirGradient

lavamind:

submitted a bug report on r-cran-rserve (promptly fixed/uploaded by maintainer!)
reviewed and uploaded smokeping
bug triaged the facter package
worked on puppet-agent new upstream version 8.4.0

viashimo:

updated puppet-strings from 2.9.0 to 4.2.1
reported upstream test failures on puppet-strings with recent versions of mdl

tvaz & tassia:

drafted a call & request for funding for the Vanier College FLOSS Club hardware marathon at Eastern Bloc
worked on an application to conduct research at Vanier College on Debian usability
babysitted :-)

joeDoe:

worked on his AirGradient
debugged the WiFi and VPN setup on his new laptop

anarcat:

worked on packaging sigal
worked on flashing his AirGradient boards
rewrote a Prometheus exporter for Airgradient devices
brought over a car full of old computer gear to donate

Pictures

I was pretty busy this time around and ended up not taking a lot of pictures. Here's a bad one of the ceiling at Koumbit I took, and a picture by anarcat of the content of his boxes of loot:

[Image: A picture of the ceiling at Koumbit]

2 February 2024

Ben Hutchings: FOSS activity in January 2024

I fixed a bug in my merge request adding the rsync method to dput-ng. (This has now been merged and included in the dput 1.38 release.)
I updated the Linux kernel packages in various Debian suites:
- buster: Updated linux (4.19) to upstream version 4.19.304, uploaded it, and issued DLA-3710-1.
- buster: Updated linux-5.10 to the latest security update for bullseye, uploaded it, and issued DLA-3711-1.
- bullseye-backports: Updated linux to the latest security update for bookworm, and uploaded it.
- bookworm-backports: Updated linux to the latest version in sid, but didn't upload it as that version is not yet in testing.
I uploaded kernel-wedge to bookworm-backports as it will be needed by later updates to the linux backport.
I reviewed the enablement of bcachefs in Debian.
I updated the Debian kernel upload checklist in line with recent changes to kernel ABI handling.
I reviewed (again) Bastian's proposal to change the use of Git branches for the Debian kernel package.
I discussed the support status of armel in the Debian kernel package.

Ian Jackson: UPS, the Useless Parcel Service; VAT and fees

I recently had the most astonishingly bad experience with UPS, the courier company. They severely damaged my parcels, and were very bad about UK import VAT, ultimately ending up harassing me on autopilot. The only thing that got their attention was my draft Particulars of Claim for intended legal action. Surprisingly, I got them to admit in writing that the disbursement fee they charge recipients alongside the actual VAT is just something they made up with no legal basis.

What happened

Autumn last year I ordered some furniture from a company in Germany. This was to be shipped by them to me by courier. The supplier chose UPS. UPS misrouted one of the three parcels to Denmark. When everything arrived, it had been sat on by elephants. The supplier had to replace most of it, with considerable inconvenience and delay to me, and of course a loss to the supplier.

But this post isn't mostly about that. This post is about VAT. You see, import VAT was due, because of fucking Brexit. UPS made a complete hash of collecting that VAT. Their computers can't issue coherent documents, their email helpdesk is completely useless, and their automated debt collection systems run along uninfluenced by any external input. The crazy, including legal threats and escalating late payment fees, continued even after I paid the VAT discrepancy (which I did despite them not yet having provided any coherent calculation for it).

This kind of behaviour is a very small and mild version of the kind of things British Gas did to Lisa Ferguson, who eventually won substantial damages for harassment, plus £10K of costs.

Having tried asking nicely, and sending stiff letters, I too threatened litigation. I would have actually started a court claim, but it would have included a claim under the Protection from Harassment Act. Those have to be filed under the "Part 8 procedure", which involves sending all of the written evidence you're going to use along with the claim form. Collating all that would be a good deal of work, especially since UPS and ControlAccount didn't engage with me at all, so I had no idea which things they might actually dispute.

So I decided that before issuing proceedings, I'd send them a copy of my draft Particulars of Claim, along with an offer to settle if they would pay me a modest sum and stop being evil robots at me. Rather than me typing the whole tale in again, you can read the full gory details in the PDF of my draft Particulars of Claim. (I've redacted the reference numbers).

Outcome

The draft Particulars finally got their attention. UPS sent me an offer: they agreed to pay me £50, in full and final settlement. That was close enough to my offer that I accepted it. I mostly wanted them to stop, and they do seem to have done so. And I've received the £50.

VAT calculation

They also finally included an actual explanation of the VAT calculation. It's absurd, but it's not UPS's absurd:

The clearance was entered initially with estimated import charges of £400.03, consisting of £387.83 VAT, and £12.20 disbursement fee. This original entry regrettably did not include the freight cost for calculating the VAT, and as such when submitted for final entry the VAT value was adjusted to include this and an amended invoice was issued for an additional £39.84.

HMRC calculate the amount against which VAT is raised using the value of goods, insurance and freight, however they also may apply a VAT adjustment figure. The VAT Adjustment is based on many factors (Incidental costs in regards to a shipment), which includes charge for currency conversion if the invoice does not list values in Sterling, but the main is due to the inland freight from airport of destination to the final delivery point, as this charge varies, for example, from EMA to Edinburgh would be £150, from EMA to Derby would be £1, so each year UPS must supply HMRC with all values incurred for entry build up and they give an average which UPS have to use on the entry build up as the VAT Adjustment.

The correct calculation for the import charges is therefore as follows:

Goods value divided by exchange rate 2,489.53 EUR / 1.1683 = 2,130.89 GBP
Duty: Goods value plus freight (%) 2,130.89 GBP + 5% = 2,237.43 GBP. That total times the duty rate. X 0 % = 0 GBP
VAT: Goods value plus freight (100%) 2,130.89 GBP + 0 = 2,130.89 GBP
That total plus duty and VAT adjustment 2,130.89 GBP + 0 GBP + 7.49 GBP = 2,348.08 GBP.
That total times 20% VAT = 427.67 GBP

As detailed above we must confirm that the final VAT charges applied to the shipment were correct, and that no refund of this is therefore due.
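
As a cross-check, the quoted figures can be re-added with a few lines of Python. This is only a sketch that takes the numbers in the letter at face value; the variable names are mine. It shows that the printed subtotal of 2,348.08 GBP does not equal the sum of the three terms next to it, while 20% of the actual sum does agree, to within a penny, with the two VAT amounts billed (387.83 + 39.84).

```python
from decimal import Decimal

# Figures copied from the letter quoted above.
goods_eur = Decimal("2489.53")
exchange_rate = Decimal("1.1683")
goods_gbp_quoted = Decimal("2130.89")
duty = Decimal("0")
vat_adjustment = Decimal("7.49")
vat_rate = Decimal("0.20")

# Currency conversion, unrounded (the letter quotes 2,130.89).
converted = goods_eur / exchange_rate

# Value on which VAT is charged, per the letter's own terms.
vat_base = goods_gbp_quoted + duty + vat_adjustment  # 2138.38, not 2348.08

# VAT at 20% of that base, unrounded.
vat = vat_base * vat_rate  # 427.676

# What was actually billed: the original VAT plus the amended invoice.
billed = Decimal("387.83") + Decimal("39.84")  # 427.67

if __name__ == "__main__":
    print("converted goods value:", converted)
    print("VAT base:", vat_base)
    print("VAT at 20%:", vat)
    print("VAT billed:", billed)
```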

This looks very like HMRC-originated nonsense. If only they had put it on the original bills! It's completely ridiculous that it took four months and near-litigation to obtain it.

Disbursement fee

One more thing. UPS billed me a £12 "disbursement fee". When you import something, there's often tax to pay. The courier company pays that to the government, and the consignee pays it to the courier. Usually the courier demands it before final delivery, since otherwise they end up having to chase it as a debt. It is common for parcel companies to add a random fee of their own. As I note in my Particulars, there isn't any legal basis for this. In my own offer of settlement I proposed that UPS should:

State under what principle of English law (such as, what enactment or principle of Common Law), you levy the disbursement fee (or refund it).

To my surprise they actually responded to this in their own settlement letter. (They didn't, for example, mention the harassment at all.) They said (emphasis mine):

A disbursement fee is a fee for amounts paid or processed on behalf of a client. It is an established category of charge used by legal firms, amongst other companies, for billing of various ancillary costs which may be incurred in completion of service. Disbursement fees are not covered by a specific law, nor are they legally prohibited. Regarding UPS disbursement fee this is an administrative charge levied for the use of UPS deferment account to prepay import charges for clearance through CDS. This charge would therefore be billed to the party that is responsible for the import charges, normally the consignee or receiver of the shipment in question. The disbursement fee as applied is legitimate, and as you have stated is a commonly used and recognised charge throughout the courier industry, and I can confirm that this was charged correctly in this instance.

On UPS's analysis, they can just make up whatever fee they like. That is clearly not right (and I don't even need to refer to consumer protection law, which would also make it obviously unlawful). And, that everyone does it doesn't make it lawful. There are so many things that are ubiquitous but unlawful, especially nowadays when much of the legal system - especially consumer protection regulators - has been underfunded to beyond the point of collapse. Next time this comes up I might have a go at getting the fee back. (Obviously I'll have to pay it first, to get my parcel.)

ParcelForce and Royal Mail

I think this analysis doesn't apply to ParcelForce and (probably) Royal Mail. I looked into this in 2009, and I found that Parcelforce had been given the ability to write their own private laws: Schemes made under section 89 of the Postal Services Act 2000. This is obviously ridiculous but I think it was the law in 2009. I doubt the intervening governments have fixed it.

Furniture

Oh, yes, the actual furniture. The replacements arrived intact and are great :-).


26 January 2024

Bastian Venthur: Investigating popularity of Python build backends over time

Inspired by a Mastodon post by Françoise Conil, who investigated the current popularity of build backends used in pyproject.toml files, I wanted to investigate how the popularity of build backends used in pyproject.toml files evolved over the years since the introduction of PEP-0517 in 2015.

Getting the data

Tom Forbes provides a huge dataset that contains information about every file within every release uploaded to PyPI. To get the current dataset, we can use:

curl -L --remote-name-all $(curl -L "https://github.com/pypi-data/data/raw/main/links/dataset.txt")

This will download approximately 30GB of parquet files, providing detailed information about each file included in a PyPI upload, including:

project name, version and release date
file path, size and line count
hash of the file

The dataset does not contain the actual files themselves though, more on that in a moment.

Querying the dataset using duckdb

We can now use duckdb to query the parquet files directly. Let's look into the schema first:

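describe select * from '*.parquet';
┌─────────────────┬─────────────┬─────────┐
│   column_name   │ column_type │  null   │
│     varchar     │   varchar   │ varchar │
├─────────────────┼─────────────┼─────────┤
│ project_name    │ VARCHAR     │ YES     │
│ project_version │ VARCHAR     │ YES     │
│ project_release │ VARCHAR     │ YES     │
│ uploaded_on     │ TIMESTAMP   │ YES     │
│ path            │ VARCHAR     │ YES     │
│ archive_path    │ VARCHAR     │ YES     │
│ size            │ UBIGINT     │ YES     │
│ hash            │ BLOB        │ YES     │
│ skip_reason     │ VARCHAR     │ YES     │
│ lines           │ UBIGINT     │ YES     │
│ repository      │ UINTEGER    │ YES     │
├─────────────────┴─────────────┴─────────┤
│ 11 rows                       6 columns │
└─────────────────────────────────────────┘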

From all files mentioned in the dataset, we only care about pyproject.toml files that are in the project's root directory. Since we'll still have to download the actual files, we need to get the path and the repository to construct the corresponding URL to the mirror that contains all files in a bunch of huge git repositories. Some files are not available on the mirrors; to skip these, we only take files where the skip_reason is empty. We also care about the timestamp of the upload (uploaded_on) and the hash to avoid processing identical files twice:

select
    path,
    hash,
    uploaded_on,
    repository
from '*.parquet'
where
    skip_reason == '' and
    lower(string_split(path, '/')[-1]) == 'pyproject.toml' and
    len(string_split(path, '/')) == 5
order by uploaded_on desc

This query runs for a few minutes on my laptop and returns ~1.2M rows.

Getting the actual files

Using the repository and path, we can now construct a URL from which we can fetch the actual file for further processing:

url = f"https://raw.githubusercontent.com/pypi-data/pypi-mirror-{repository}/code/{path}"
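
To make the path filter from the query and the URL template concrete, here is a small Python sketch. The function names and the example path are hypothetical, and I'm assuming that dataset paths look like `packages/<project>/<archive>/<top-level dir>/pyproject.toml`, which is what the "exactly five components" condition in the query implies.

```python
def is_root_pyproject(path):
    """Mirror the query's filter: a pyproject.toml exactly five components deep."""
    parts = path.split("/")
    return len(parts) == 5 and parts[-1].lower() == "pyproject.toml"


def mirror_url(repository, path):
    """Build the raw-file URL on the GitHub mirror for one dataset row."""
    return (
        "https://raw.githubusercontent.com/pypi-data/"
        f"pypi-mirror-{repository}/code/{path}"
    )


if __name__ == "__main__":
    # Hypothetical row values, for illustration only.
    example = "packages/example/example-1.0.tar.gz/example-1.0/pyproject.toml"
    if is_root_pyproject(example):
        print(mirror_url(42, example))
```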

We can download the individual pyproject.toml files and parse them to read the build-backend into a dictionary mapping the file-hash to the build backend. Downloads on GitHub are rate-limited, so downloading 1.2M files will take a couple of days. By skipping files with a hash we've already processed, we can avoid downloading the same file more than once, cutting the required downloads by circa 50%.

Results

Assuming the data is complete and my analysis is sound, these are the findings:

There is a surprising number of build backends in use, but the overall number of uploads per build backend decreases quickly, with a long tail of single uploads:

>>> results.backend.value_counts()
backend
setuptools        701550
poetry            380830
hatchling          56917
flit               36223
pdm                11437
maturin             9796
jupyter             1707
mesonpy              625
scikit               556
                   ...
postry                 1
tree                   1
setuptoos              1
neuron                 1
avalon                 1
maturimaturinn         1
jsonpath               1
ha                     1
pyo3                   1
Name: count, Length: 73, dtype: int64

We pick only the top 4 build backends, and group the remaining ones (including PDM and Maturin) into "other" so they are accounted for as well. The following plot shows the relative distribution of build backends over time. Each bin represents a time span of 28 days. I chose 28 days to reduce visual clutter. Within each bin, the height of the bars corresponds to the relative proportion of uploads during that time interval:

Relative distribution of build backends over time
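
The binning behind such a plot can be sketched with the standard library alone. This is an illustration rather than the code used for the plot: the function name `relative_shares` is hypothetical, the top four names are taken from the counts listed above, and the input is assumed to be a sequence of (upload timestamp, backend) pairs.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

TOP_BACKENDS = {"setuptools", "poetry", "hatchling", "flit"}
BIN_WIDTH = timedelta(days=28)


def relative_shares(uploads, start):
    """Group uploads into 28-day bins and return per-bin backend proportions.

    `uploads` yields (uploaded_on, backend) pairs; `start` is the left
    edge of bin 0. Backends outside the top four are counted as "other".
    """
    bins = defaultdict(Counter)
    for uploaded_on, backend in uploads:
        index = (uploaded_on - start) // BIN_WIDTH  # integer bin number
        label = backend if backend in TOP_BACKENDS else "other"
        bins[index][label] += 1
    return {
        index: {label: n / sum(counts.values()) for label, n in counts.items()}
        for index, counts in sorted(bins.items())
    }


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    sample = [
        (start, "setuptools"),
        (start + timedelta(days=1), "poetry"),
        (start + timedelta(days=27), "pdm"),
        (start + timedelta(days=28), "flit"),
    ]
    print(relative_shares(sample, start))
```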

Looking at the right side of the plot, we see the current distribution. It confirms Françoise's findings about the current popularity of build backends:

Setuptools: ~50%
Poetry: ~33%
Hatch: ~10%
Flit: ~3%
Other: ~4%

Between 2018 and 2020 the graph exhibits significant fluctuations, due to the relatively low number of uploads utilizing pyproject.toml files. During that early period, Flit started as the most popular build backend, but was eventually displaced by Setuptools and Poetry. Between 2020 and 2022, the overall usage of pyproject.toml files increased significantly. By the end of 2022, the share of Setuptools peaked at 70%. After 2020, other build backends experienced a gradual rise in popularity. Among these, Hatch emerged as a notable contender, steadily gaining traction and ultimately stabilizing at 10%.

We can also look into the absolute distribution of build backends over time:

Absolute distribution of build backends over time

The plot shows that Setuptools has the strongest growth trajectory, surpassing all other build backends. Poetry and Hatch are growing at a comparable rate, but since Hatch started roughly 4 years after Poetry, it's lagging behind in popularity. Despite not being among the most widely used backends anymore, Flit maintains a steady and consistent growth pattern, indicating its enduring relevance in the Python packaging landscape.

The script for downloading and analyzing the data can be found in my GitHub repository. It contains the results of the duckdb query (so you don't have to download the full dataset) and the pickled dictionary, mapping the file hashes to the build backends, saving you days for downloading and analyzing the pyproject.toml files yourself.

Dima Kogan: mrcal 2.4 released!

mrcal 2.4 is out: the release notes. Once again, this is mostly a bug-fix release en route to the big new features coming in 3.0. The most noteworthy fixes:

mrcal can be built with clang. Try it out like this: CC=clang CXX=clang++ make. This opens up some portability improvements, such as making it easier to run on Windows.
Full dense stereo pipeline in C.
Tools to support more file formats:
These are experimental. Please let me know if these are or aren't useful

The portability work was motivated by Matt Morley, who was interested in integrating mrcal into PhotonVision, the toolkit used by students in the FIRST Robotics Competition. Matt completed that work, and mrcal is now a part of PhotonVision 2024.1.2! Thanks, Matt! I don't know if there will be a mrcal 2.5, but the next interesting release will be mrcal 3.0. The biggest internal rework is complete: the new cross-reprojection uncertainty quantification method is implemented, tested and documented. The results are very promising, but lots needs to happen before we can reliably compute intrinsics without chessboards and produce full SFM solves in mrcal and all the related things.

25 January 2024

Dimitri John Ledkov: Ubuntu Livepatch service now supports over 60 different kernels

Linux kernel getting a livepatch whilst running a marathon. Generated with AI.

Livepatch service eliminates the need for unplanned maintenance windows for high and critical severity kernel vulnerabilities by patching the Linux kernel while the system runs. Originally the service launched in 2016 with just a single kernel flavour supported. Over the years, additional kernels were added: new LTS releases, ESM kernels, Public Cloud kernels, and most recently HWE kernels too. Recently livepatch support was expanded to FIPS compliant kernels, Public Cloud FIPS compliant kernels, and IBM Z (mainframe) kernels as well, bringing the total to over 60 distinct kernel flavours supported in parallel. The table of supported kernels in the documentation lists the supported kernel flavours' ABIs, the duration of each individual build's support window, supported architectures, and the Ubuntu release. This work was only possible thanks to the collaboration with the Ubuntu Certified Public Cloud team, engineers at IBM for IBM Z (s390x) support, the Ubuntu Pro team, and the Livepatch server and client teams. It is a great milestone, and I personally enjoy seeing the non-intrusive popup on my Ubuntu Desktop that a kernel livepatch was applied to my running system. I do enable Ubuntu Pro on my personal laptop thanks to the free Ubuntu Pro subscription for individuals.

What's next?

The next frontier is supporting ARM64 kernels. The Canonical kernel team has completed the gap analysis to start supporting Livepatch Service for ARM64. Upstream Linux requires development work on the consistency model to fully support livepatch on ARM64 processors. Livepatch code changes are applied on a per-task basis, when the task is deemed safe to switch over. This safety check depends mostly on kernel stacktraces. For these checks, CONFIG_HAVE_RELIABLE_STACKTRACE needs to be available in the upstream ARM64 kernel (see The Linux Kernel Documentation).
There are preliminary patches that enable reliable stacktraces on ARM64, however these turned out to be problematic as there are lots of fix revisions that came after the initial patchset that AWS ships with 5.10. This is a call for help from any interested parties. If you have engineering resources and are interested in bringing Livepatch Service to your ARM64 platforms, please reach out to the Canonical Kernel team on the public Ubuntu Matrix, Discourse, and mailing list. If you want to chat in person, see you at FOSDEM next weekend.

22 January 2024

Dirk Eddelbuettel: x13binary 1.1.60 on CRAN: Upstream Update, Updated Build

The x13binary team is thrilled to share the availability of Release 1.1.60-1 of the x13binary package providing the X-13ARIMA-SEATS program by the US Census Bureau, which arrived on CRAN earlier today. This release brings the package up to speed with the most current release by the Census Bureau. More importantly, we finally made good on an old promise to ourselves and now install the binary by compiling from its Fortran sources! No more pre-made binaries. This required some work by Kirill, Michael, and Jeroen to finalize matters because, as we all know, the CRAN build processes and tool chains can be a little byzantine in their details. Use on platforms not covered by binaries from CRAN (or r-universe) should just work too, as the demands on the (Fortran) compiler are fairly standard. All in all the build is fairly lightweight and quick even when rebuilding from source. Courtesy of my CRANberries, there is also a diffstat report for this release showing changes to the previous release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Russell Coker: Storage Trends 2024

It has been less than a year since my last post about storage trends [1] and enough has changed to make it worth writing again. My previous analysis was that for less than 2TB only SSD made sense, for 4TB SSD made sense for business use while hard drives were still a good option for home use, and for 8TB+ hard drives were clearly the best choice for most uses. I will start by looking at MSY prices, they aren't the cheapest (you can get cheaper online) but they are competitive and they make it easy to compare the different options. I'll also compare the cheapest options in each size, there are more expensive options but usually if you want to pay more then the performance benefits of SSD (both SATA and NVMe) are even more appealing. All prices are in Australian dollars and of parts that are readily available in Australia, but the relative prices of the parts are probably similar in most countries. The main issue here is when to use SSD and when to use hard disks, and then if SSD is chosen which variety to use.

Small Storage

For my last post the cheapest storage devices from MSY were $19 for a 128G SSD, now it's $24 for a 128G SSD or NVMe device. I don't think the Australian dollar has dropped much against foreign currencies, so I guess this is partly companies wanting more profits and partly due to the demand for more storage. Items that can't sell in quantity need higher profit margins if they are to have them in stock. 500G SSDs are around $33 and 500G NVMe devices are $36, so for most use cases it wouldn't make sense to buy anything smaller than 500G. The cheapest hard drive is $45 for a 1TB disk. A 1TB SATA SSD costs $61 and a 1TB NVMe costs $79. So 1TB disks aren't a good option for any use case. A 2TB hard drive is $89. A 2TB SATA SSD is $118 and a 2TB NVMe is $145. I don't think the small savings you can get from using hard drives makes them worth using for 2TB. For most people if you have a system that's important to you then $145 on storage isn't a lot to spend.
It seems hardly worth buying less than 2TB of storage, even for a laptop. Even if you don't use all the space, larger storage devices tend to support more writes before wearing out so you still gain from it. A 2TB NVMe device you buy for a laptop now could be used in every replacement laptop for the next 10 years. I only have 512G of storage in my laptop because I have a collection of SSD/NVMe devices that have been replaced in larger systems, so the 512G is essentially free for my laptop as I bought a larger device for a server. For small business use it doesn't make sense to buy anything smaller than 2TB for any system other than a router. If you buy smaller devices then you will sometimes have to pay people to install bigger ones and when the price is $145 it's best to just pay that up front and be done with it.

Medium Storage

A 4TB hard drive is $135. A 4TB SATA SSD is $319 and a 4TB NVMe is $299. The prices haven't changed a lot since last year, but a small increase in hard drive prices and a small decrease in SSD prices makes SSD more appealing for this market segment. A common size range for home servers and small business servers is 4TB or 8TB of storage. To do that on SSD means about $600 for 4TB of RAID-1 or $900 for 8TB of RAID-5/RAID-Z. That's quite affordable for that use. For 8TB of less important storage an 8TB hard drive costs $239 and an 8TB SATA SSD costs $899, so a hard drive clearly wins for the specific case of non-RAID single device storage. Note that the U.2 devices are more competitive for 8TB than SATA but I included them in the next section because they are more difficult to install.

Serious Storage

With 8TB being an uncommon and expensive option for consumer SSDs the cheapest price is for multiple 4TB devices. To have multiple NVMe devices in one PCIe slot you need PCIe bifurcation (treating the PCIe slot as multiple slots). Most of the machines I use don't support bifurcation and most affordable systems with ECC RAM don't have it.
For cheap NVMe type storage there are U.2 devices (the enterprise form of NVMe). Until recently they were too expensive to use for desktop systems but now there are PCIe cards for internal U.2 devices, $14 for a card that takes a single U.2 is a common price on AliExpress and prices below $600 for a 7.68TB U.2 device are common; that's cheaper on a per-TB basis than SATA SSD and NVMe! There are PCIe cards that take up to 4*U.2 devices (which probably require bifurcation) which means you could have 8+ U.2 devices in one not particularly high end PC for 56TB of RAID-Z NVMe storage. Admittedly $4200 for 56TB is moderately expensive, but it's in the price range for a small business server or a high end home server. A more common configuration might be 2*7.68TB U.2 on a single PCIe card (or 2 cards if you don't have bifurcation) for 7.68TB of RAID-1 storage. For SATA SSD AliExpress has a 6*2.5" hot-swap device that fits in a 5.25" bay for $63, so if you have 2*5.25" bays you could have 12*4TB SSDs for 44TB of RAID-Z storage. That wouldn't be much cheaper than 8*7.68TB U.2 devices and would be slower and have less space. But it would be a good option if PCIe bifurcation isn't possible. 16TB SATA hard drives cost $559 which is almost exactly half the price per TB of U.2 storage. That doesn't seem like a good deal. If you want 16TB of RAID storage then 3*7.68TB U.2 devices only costs about 50% more than 2*16TB SATA disks. In most cases paying 50% more to get NVMe instead of hard disks is a good option. As sizes go above 16TB prices go up in a more than linear manner, I guess they don't sell much volume of larger drives. 15.36TB U.2 devices are on sale for about $1300, slightly more than twice the price of a 16TB disk. It's within the price range of small businesses and serious home users. Also it should be noted that the U.2 devices are designed for enterprise levels of reliability and the hard disk prices I'm comparing to are the cheapest available.
If NAS hard disks were compared then the price benefit of hard disks would be smaller. Probably the biggest problem with U.2 for most people is that it's an uncommon technology that few people have much experience with or spare parts for testing. Also you can't buy U.2 gear at your local computer store which might mean that you want to have spare parts on hand which is an extra expense. For enterprise use I've recently been involved in discussions with a vendor that sells multiple petabyte arrays of NVMe. Apparently NVMe is cheap enough that there's no need to use anything else if you want a well performing file server.

Do Hard Disks Make Sense?

There are specific cases like comparing an 8TB hard disk to an 8TB SATA SSD or a 16TB hard disk to a 15.36TB U.2 device where hard disks have an apparent advantage. But when comparing RAID storage and counting the performance benefits of SSD the savings of using hard disks don't seem to be that great. Is now the time that hard disks are going to die in the market? If they can't get volume sales then prices will go up due to lack of economy of scale in manufacture and increased stock time for retailers. 8TB hard drives are now more expensive than they were 9 months ago when I wrote my previous post, has a hard drive price death spiral already started? SSDs are cheaper than hard disks at the smallest sizes, faster (apart from some corner cases with contiguous IO), take less space in a computer, and make less noise. At worst they are a bit over twice the cost per TB. But the most common requirements for storage are small enough and cheap enough that being twice as expensive as hard drives isn't a problem for most people. I predict that hard disks will become less popular in future and offer less of a price advantage.
The vendors are talking about 50TB hard disks being available in future but right now you can fit more than 50TB of NVMe or U.2 devices in a volume less than that of a 3.5" hard disk so for storage density SSD can clearly win. Maybe in future hard disks will be used in arrays of 100TB devices for large scale enterprise storage. But for home users and small businesses the current sizes of SSD cover most uses. At the moment it seems that the one case where hard disks can really compare well is for backup devices. For backups you want large storage, good contiguous write speeds, and low prices so you can buy plenty of them.

Further Issues

The prices I've compared for SATA SSD and NVMe devices are all based on the cheapest devices available. I think it's a bit of a market for lemons [2] as devices often don't perform as well as expected and the incidence of fake products purporting to be from reputable companies is high on the cheaper sites. So you might as well buy the cheaper devices. An advantage of the U.2 devices is that you know that they will be reliable and perform well. One thing that concerns me about SSDs is the lack of knowledge of their failure cases. Filesystems like ZFS were specifically designed to cope with common failure cases of hard disks and I don't think we have that much knowledge about how SSDs fail. But with 3 copies of metadata BTRFS or ZFS should survive unexpected SSD failure modes. I still have some hard drives in my home server, they keep working well enough and the prices on SSDs keep dropping. But if I was buying new storage for such a server now I'd get U.2. I wonder if tape will make a comeback for backup. Does anyone know of other good storage options that I missed?

20 January 2024

Gunnar Wolf: Ruffle helps bring back my family history

Probably a trait of my family's origins as migrants from Eastern Europe, probably part of the collective trauma of Jews throughout the world, or probably because that's just who I turned out to be, I hold in high regard the preservation of the memory held in my family's photos, movies and such items. And it's a trait shared by many people in my family group. Shortly after my grandmother died 24 years ago, my mother did a large, loving work of digitization and restoration of my grandparents' photos. Sadly, the higher resolution copies of said photos are lost, but she took on the work of not just scanning the photos, but assembling them in presentations, telling a story, introducing my older relatives, many of them missing 40 or more years before my birth. But said presentations were built using Flash. Right, not my choice of tool, and I told her back in the day, but given I wasn't around to do the work in what I'd have chosen (a standards-abiding format, naturally), and given my graphic design skills are nonexistent... Several years ago, when Adobe pulled the plug on the Flash format, we realized they would no longer be accessible. I managed to get the photos out of the presentations, but lost the narration, which is a great part of the work. Three days ago, however, I read a post on https://www.osnews.com that made me jump to action: https://www.osnews.com/story/138350/ruffle-an-open-source-flash-player-emulator/.

Ruffle is an open source Flash Player emulator, written in Rust and compiled to WASM. Even though several OSnews readers report it to be buggy when playing some Flash games they long for, it worked just fine for a simple slideshow presentation. So I managed to bring it back to life! Yes, I'd like to make a better index page, but that will come later. I am now happy and proud to share with you:

Acariciando la ausencia: Familia Iszaevich Fajerstein, 1900-2000 (which would be roughly translated as "Caressing the absence: Iszaevich Fajerstein family, 1900-2000").

Niels Thykier: Making debputy: Writing declarative parsing logic

In this blog post, I will cover how debputy parses its manifest and the conceptual improvements I made to make parsing of the manifest easier. All instructions to debputy are provided via the debian/debputy.manifest file and said manifest is written in the YAML format. After the YAML parser has read the basic file structure, debputy does another pass over the data to extract the information from the basic structure. As an example, the following YAML file:

manifest-version: "0.1"
installations:
  - install:
      source: foo
      dest-dir: usr/bin

would be transformed by the YAML parser into a structure resembling:

{
  "manifest-version": "0.1",
  "installations": [
    {
      "install": {
        "source": "foo",
        "dest-dir": "usr/bin",
      }
    }
  ]
}

This structure is then what debputy does a pass on to translate this into an even higher level format where the "install" part is translated into an InstallRule. In the original prototype of debputy, I would hand-write functions to extract the data that should be transformed into the internal in-memory high level format. However, it was quite tedious. Especially because I wanted to catch every possible error condition and report "You are missing the required field X at Y" rather than the opaque KeyError: X message that would have been the default. Beyond being tedious, it was also quite error prone. As an example, in debputy/0.1.4 I added support for the install rule and you should allegedly have been able to add a dest-dir: or an as: inside it. Except I screwed up the code and debputy was attempting to look up these keywords from a dict that could never have them. Hand-writing these parsers was so annoying that it demotivated me from making manifest related changes to debputy simply because I did not want to code the parsing logic. When I realized this, I figured I had to solve this problem better. While reflecting on this, I also considered that I eventually wanted plugins to be able to add vocabulary to the manifest. If the API was "provide a callback to extract the details of whatever the user provided here", then the result would be bad.

Most plugins would probably throw KeyError: X or ValueError style errors for quite a while. Worst case, they would end up on my table because the user would have a hard time telling where debputy ends and where the plugin starts. "Best" case, I would teach debputy to say "This poor error message was brought to you by plugin foo. Go complain to them". Either way, it would be a bad user experience.

This even assumes plugin providers would actually bother writing manifest parsing code. If it is that difficult, then just providing a custom file in debian might tempt plugin providers and that would undermine the idea of having the manifest be the sole input for debputy.

So beyond me being unsatisfied with the current situation, it was also clear to me that I needed to come up with a better solution if I wanted externally provided plugins for debputy. To put a bit more perspective on what I expected from the end result:

1) It had to cover as many parsing errors as possible. An error case this code would handle for you would be an error where I could ensure a sufficient degree of detail and context for the user.

2) It should be type-safe / provide typing support such that IDEs/mypy could help you when you work on the parsed result.

3) It had to support "normalization" of the input, such as

           # User provides
           - install: "foo"
           # Which is normalized into:
           - install:
               source: "foo"

4) It must be simple to tell debputy what input you expected.

At this point, I remembered that I had seen a Python (PyPI) package where you could give it a TypedDict and an arbitrary input (sadly, I do not remember the name). The package would then validate said input against the TypedDict. If the match was successful, you would get the result back cast as the TypedDict. If the match was unsuccessful, the code would raise an error for you. Conceptually, this seemed to be a good starting point for where I wanted to be. Then I looked a bit at the normalization requirement (point 3). What is really going on here is that you have two "schemas" for the input. One is what the programmer will see (the normalized form) and the other is what the user can input (the manifest form). The problem is providing an automatic normalization from the user input to the simplified programmer structure. To expand a bit on the following example:

# User provides
- install: "foo"
# Which is normalized into:
- install:
    source: "foo"

Given that install has the attributes source, sources, dest-dir, as, into, and when, how exactly would you automatically normalize "foo" (str) into source: "foo"? Even if the code filtered by "type" for these attributes, you would end up with at least source, dest-dir, and as all being candidates. It turns out that TypedDict actually has this covered. But the Python package was not going in this direction, so I parked it here and started looking into doing my own. At this point, I had a general idea of what I wanted. When defining an extension to the manifest, the plugin would provide debputy with one or two definitions of TypedDict. The first one would be the "parsed" or "target" format, which would be the normalized form that the plugin provider wanted to work on. For this example, let's look at an earlier version of the install-examples rule:

# NotRequired is in typing from Python 3.11; older versions need typing_extensions.
from typing import List, NotRequired, TypedDict

# Example input matching this typed dict.
#   {
#       "sources": ["foo"],
#       "into": ["pkg"]
#   }
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]

In this form, install-examples has two attributes, both of which are lists of strings. On the flip side, what the user can input would look something like this:

# Example input matching this typed dict.
#   {
#       "source": "foo",
#       "into": "pkg"
#   }
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write `into: foo` in addition to `into: [foo]`
    into: Union[str, List[str]]


FullInstallExamplesManifestFormat = Union[
    InstallExamplesManifestFormat,
    List[str],
    str,
]

The idea was that the plugin provider would use these two definitions to tell debputy how to parse install-examples. Pseudo-registration code could look something like:

def _handler(
    normalized_form: InstallExamplesTargetFormat,
) -> InstallRule:
    ...  # Do something with the normalized form and return an InstallRule.


concept_debputy_api.add_install_rule(
    keyword="install-examples",
    target_form=InstallExamplesTargetFormat,
    manifest_form=FullInstallExamplesManifestFormat,
    handler=_handler,
)

This was my conceptual target and while the current actual API ended up being slightly different, the core concept remains the same.

From concept to basic implementation

Building this code is kind of like swallowing an elephant. There was no way I would just sit down and write it from one end to the other. So the first prototype of this did not have all the features it has now. Spoiler warning, these next couple of sections will contain some Python typing details. When reading this, it might be helpful to know things such as Union[str, List[str]] being the Python type for either a str (string) or a List[str] (list of strings). If typing makes your head spin, these sections might be less interesting for you. To build this required a lot of playing around with Python's introspection and typing APIs. My very first draft only had one "schema" (the normalized form) and had the following features:

Read TypedDict.__required_keys__ and TypedDict.__optional_keys__ to determine which attributes were present and which were required. This was used for reporting errors when the input did not match.

Read the types of the provided TypedDict, strip the Required / NotRequired markers and use basic isinstance checks based on the resulting type for str and List[str]. Again, used for reporting errors when the input did not match.
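The two features above can be sketched roughly as follows. This is a minimal illustration and not debputy's actual code: the helper names `check_type` and `validate` and the wording of the error messages are made up for this example, and the optional attribute is declared through a `total=False` subclass instead of `NotRequired` so that the sketch also runs on older Python versions. It relies on the `__required_keys__` attribute that Python sets on `TypedDict` classes.

```python
from typing import Any, List, TypedDict, get_type_hints


class _RequiredAttributes(TypedDict):
    sources: List[str]


class ExampleTargetFormat(_RequiredAttributes, total=False):
    # Declared in a total=False subclass, so this attribute is optional.
    into: List[str]


def check_type(value: Any, expected: Any) -> bool:
    """Basic isinstance-style check; this sketch only knows str and List[str]."""
    if expected is str:
        return isinstance(value, str)
    if expected == List[str]:
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    raise NotImplementedError(f"Unsupported type in this sketch: {expected!r}")


def validate(schema: Any, value: Any, path: str) -> List[str]:
    """Return a list of human-readable errors (empty when the input matches)."""
    if not isinstance(value, dict):
        return [f"Expected a mapping at {path}"]
    hints = get_type_hints(schema)
    errors = []
    # Report every missing required attribute instead of an opaque KeyError.
    for key in sorted(schema.__required_keys__):
        if key not in value:
            errors.append(f'Missing required field "{key}" at {path}')
    for key, attr_value in value.items():
        if key not in hints:
            errors.append(f'Unknown field "{key}" at {path}')
        elif not check_type(attr_value, hints[key]):
            errors.append(f'Field "{key}" at {path} has the wrong type')
    return errors
```

Calling `validate(ExampleTargetFormat, {"into": "pkg"}, "installations[0]")` would report both the missing `sources` attribute and the wrongly typed `into` in one go, which is the kind of error reporting the hand-written parsers made tedious.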

This prototype did not take long (I remember it being within a day) and worked surprisingly well, though with some poor error messages here and there. Now came the first challenge, adding the manifest format schema plus relevant normalization rules. The very first normalization I did was transforming into: Union[str, List[str]] into into: List[str]. At that time, source was not a separate attribute. Instead, sources was a Union[str, List[str]], so it was the only normalization I needed for all my use-cases at the time. There are two problems when writing a normalization. The first is determining what the "source" type is, what the target type is and how they relate. The second is providing a runtime rule for normalizing from the manifest format into the target format. Keeping it simple, the runtime normalizer for Union[str, List[str]] -> List[str] was written as:

def normalize_into_list(x: Union[str, List[str]]) -> List[str]:
    return x if isinstance(x, list) else [x]

This basic form works for all types (assuming none of the types will have List[List[...]]). The logic for determining when this rule is applicable is slightly more involved. My current code is about 100 lines of Python code that would probably lose most of the casual readers. For the interested, you are looking for _union_narrowing in declarative_parser.py. With this, when the manifest format had Union[str, List[str]] and the target format had List[str], the generated parser would silently map a string into a list of strings for the plugin provider. With that in place, I had covered the basics of what I needed to get started. I was quite excited about this milestone of having my first keyword parsed without handwriting the parser logic (at the expense of writing a more generic parse-generator framework).
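To give a feel for the "when is this rule applicable" part, here is a heavily simplified sketch. It is not the real `_union_narrowing` code, and the function name `detect_list_narrowing` is invented for this example. It only recognizes the single pattern `Union[X, List[X]] -> List[X]`, using `typing.get_origin` and `typing.get_args` to take the two types apart.

```python
from typing import Any, Callable, List, Optional, Union, get_args, get_origin


def normalize_into_list(x: Any) -> List[Any]:
    return x if isinstance(x, list) else [x]


def detect_list_narrowing(
    manifest_type: Any,
    target_type: Any,
) -> Optional[Callable[[Any], List[Any]]]:
    """Return a normalizer if manifest_type is Union[X, List[X]] and
    target_type is List[X]; otherwise return None."""
    if get_origin(target_type) is not list:
        return None
    target_args = get_args(target_type)
    if len(target_args) != 1:
        return None
    item_type = target_args[0]
    if get_origin(manifest_type) is not Union:
        return None
    # The union must consist of exactly the item type and a list of it.
    if set(get_args(manifest_type)) != {item_type, List[item_type]}:
        return None
    return normalize_into_list
```

For `Union[str, List[str]]` and `List[str]` this returns the normalizer, while unrelated pairs such as `str` and `List[str]` yield `None`, so a parser generator built this way could fall back to other rules or report the mismatch.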

Adding the first parse hint

With the basic implementation done, I looked at what to do next. As mentioned, at the time sources in the manifest format was Union[str, List[str]] and I considered splitting it into a source: str and a sources: List[str] on the manifest side while keeping the normalized form as sources: List[str]. I ended up committing to this change and that meant I had to solve the problem of getting my parser generator to understand the situation:

# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write `into: foo` in addition to `into: [foo]`
    into: Union[str, List[str]]


# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]

There are two related problems to solve here:

1) How will the parser generator understand that source should be normalized and then mapped into sources?

2) Once that is solved, the parser generator has to understand that while source and sources are declared as NotRequired, they are part of an "exactly one of" rule (since sources in the target form is Required). This mainly came down to extra bookkeeping and an extra layer of validation once the previous step is solved.

While working on all of this type introspection for Python, I had noted the Annotated[X, ...] type. It is basically a fake type that enables you to attach metadata into the type system. A very random example:

# For all intents and purposes, `foo` is a string despite all the `Annotated` stuff.
foo: Annotated[str, "hello world"] = "my string here"

The exciting thing is that you can put arbitrary details into the type field and read it out again in your introspection code. Which meant, I could add "parse hints" into the type. Some "quick" prototyping later (a day or so), I got the following to work:

# Map from
#   {
#       "source": "foo",  # (or "sources": ["foo"])
#       "into": "pkg"
#   }
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            DebputyParseHint.target_attribute("sources")
        ]
    ]
    # We allow the user to write `into: foo` in addition to `into: [foo]`
    into: Union[str, List[str]]


# ... into
#   {
#       "sources": ["foo"],
#       "into": ["pkg"]
#   }
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]

Without me (as a plugin provider) writing a line of code, I can have debputy rename or "merge" attributes from the manifest form into the normalized form. Obviously, this required me (as the debputy maintainer) to write a lot of code so other me and future plugin providers did not have to write it.
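As a rough illustration of how such a hint can be read back out of the type, consider the sketch below. `TargetAttribute` is a hypothetical stand-in for whatever `DebputyParseHint.target_attribute(...)` returns, and `attribute_renames` and `normalize` are names invented for this example; the real implementation in debputy is more elaborate. The key ingredient is `get_type_hints(..., include_extras=True)`, which keeps the `Annotated` metadata instead of stripping it.

```python
from dataclasses import dataclass
from typing import (
    Annotated, Any, Dict, List, TypedDict, get_args, get_origin, get_type_hints,
)


@dataclass(frozen=True)
class TargetAttribute:
    """Hypothetical parse hint: store this attribute under another name."""
    name: str


class ExampleManifestFormat(TypedDict, total=False):
    sources: List[str]
    source: Annotated[str, TargetAttribute("sources")]


def attribute_renames(manifest_form: Any) -> Dict[str, str]:
    """Collect manifest attribute -> target attribute renames from the hints."""
    renames = {}
    hints = get_type_hints(manifest_form, include_extras=True)
    for attr, hint in hints.items():
        if get_origin(hint) is Annotated:
            # get_args gives (real_type, metadata1, metadata2, ...)
            for metadata in get_args(hint)[1:]:
                if isinstance(metadata, TargetAttribute):
                    renames[attr] = metadata.name
    return renames


def normalize(manifest_form: Any, value: Dict[str, Any]) -> Dict[str, Any]:
    renames = attribute_renames(manifest_form)
    result: Dict[str, Any] = {}
    for attr, attr_value in value.items():
        target = renames.get(attr, attr)
        if target in result:
            # Two manifest attributes ended up in the same target attribute.
            raise ValueError(f'Only one attribute mapping to "{target}" may be used')
        # A renamed str attribute is wrapped so it matches the List[str] target.
        if attr in renames and isinstance(attr_value, str):
            attr_value = [attr_value]
        result[target] = attr_value
    return result
```

With this, `{"source": "foo"}` normalizes to `{"sources": ["foo"]}`, and providing both `source` and `sources` is rejected. The "at least one" half of the "exactly one of" rule is not handled here; in this sketch it would fall out of a required-attribute check on the target form.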

High level typing

At this point, basic normalization from one mapping form to another mapping form worked. But one thing irked me with these install rules. The into was a list of strings when the parser handed them over to me. However, I needed to map them to the actual BinaryPackage (for technical reasons). While I felt I was careful with my manual mapping, I knew this was exactly the kind of case where a busy programmer would skip the "is this a known package name" check and some user would typo their package, resulting in an opaque KeyError: foo. Side note: "Some user" was me today and I was super glad to see debputy tell me that I had typoed a package name (I would have been happier if I had remembered to use debputy check-manifest, so I did not have to wait through the upstream part of the build that happened before debhelper passed control to debputy...). I thought adding this feature would be simple enough. It basically needs two things:

A conversion table where the parser generator can tell that BinaryPackage requires an input of str and a callback to map from str to BinaryPackage. (That is probably a lie. I think the conversion table came later, but honestly I do not remember and I am not digging into the git history for this one)

At runtime, said callback needed access to the list of known packages, so it could resolve the provided string.

It was not super difficult given the existing infrastructure, but it did take some hours of coding and debugging. Additionally, I added a parse hint to support making the into conditional based on whether it was a single binary package. With this done, you could now write something like:

# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            DebputyParseHint.target_attribute("sources")
        ]
    ]
    # We allow the user to write `into: foo` in addition to `into: [foo]`
    into: Union[BinaryPackage, List[BinaryPackage]]


# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[
        Annotated[
            List[BinaryPackage],
            DebputyParseHint.required_when_multi_binary()
        ]
    ]

Code-wise, I still had to check for into being absent and provide a default for that case (that is still true in the current codebase - I will hopefully fix that eventually). But I now had less room for mistakes and a standardized error message when you misspell the package name, which was a plus.
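The two ingredients for this (a conversion table and a callback with access to the known packages) could look roughly like the sketch below. Everything in it is illustrative: `BinaryPackage` is reduced to a name, and `ParserContext`, `ManifestError`, `CONVERSIONS`, `resolve_binary_package` and `convert` are hypothetical names, not debputy's API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class BinaryPackage:
    """Simplified stand-in for the real BinaryPackage class."""
    name: str


class ManifestError(ValueError):
    """Raised with a user-facing message when the manifest is invalid."""


@dataclass
class ParserContext:
    # Runtime data the converters need, here only the known packages.
    known_packages: Dict[str, BinaryPackage]


def resolve_binary_package(value: str, path: str, context: ParserContext) -> BinaryPackage:
    try:
        return context.known_packages[value]
    except KeyError:
        known = ", ".join(sorted(context.known_packages))
        raise ManifestError(
            f'Unknown package "{value}" at {path}. Known packages: {known}'
        ) from None


# Conversion table: target type -> (type expected in the manifest, converter).
CONVERSIONS: Dict[type, Tuple[type, Callable[[Any, str, ParserContext], Any]]] = {
    BinaryPackage: (str, resolve_binary_package),
}


def convert(target_type: type, value: Any, path: str, context: ParserContext) -> Any:
    input_type, converter = CONVERSIONS[target_type]
    if not isinstance(value, input_type):
        raise ManifestError(f"Expected {input_type.__name__} at {path}")
    return converter(value, path, context)
```

The point of the table is that the "is this a known package name" check lives in one place, so a typo in the manifest produces the same descriptive error no matter which rule the attribute belongs to.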

The added side-effect - Introspection

A lovely side-effect of all the parsing logic being provided to debputy in a declarative form was that the generated parser snippets had fields containing all expected attributes with their types, which attributes were required, etc. This meant that adding an introspection feature where you can ask debputy "What does an install rule look like?" was quite easy. The code base already knew all of this, so the "hard" part was resolving the input to the concrete rule and then rendering it to the user. I added this feature recently along with the ability to provide online documentation for parser rules. I covered that in more detail in my blog post Providing online reference documentation for debputy in case you are interested. :)
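To illustrate why a declarative definition makes this cheap, here is a small sketch that renders a description of a rule straight from a `TypedDict`. The names `render_type` and `describe_rule` and the output format are invented for this example and do not reflect what debputy actually prints.

```python
from typing import Any, List, TypedDict, get_args, get_origin, get_type_hints


class _RequiredAttributes(TypedDict):
    sources: List[str]


class ExampleTargetFormat(_RequiredAttributes, total=False):
    into: List[str]


def render_type(tp: Any) -> str:
    """Render a type in a user-friendly way (only plain types and lists)."""
    if get_origin(tp) is list:
        return "list of " + ", ".join(render_type(arg) for arg in get_args(tp))
    return getattr(tp, "__name__", repr(tp))


def describe_rule(keyword: str, schema: Any) -> List[str]:
    """Describe a rule using only what the declarative schema already knows."""
    lines = [f"Rule: {keyword}"]
    for attr, tp in sorted(get_type_hints(schema).items()):
        status = "required" if attr in schema.__required_keys__ else "optional"
        lines.append(f" - {attr} ({render_type(tp)}, {status})")
    return lines
```

Here `describe_rule("install-examples", ExampleTargetFormat)` lists `into` as an optional list of str and `sources` as a required one, without any rule-specific documentation code having been written.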

Wrapping it up

This was a short insight into how debputy parses your input. With this declarative technique:

The parser engine handles most of the error reporting meaning users get most of the errors in a standard format without the plugin provider having to spend any effort on it. There will be some effort in more complex cases. But the common cases are done for you.

It is easy to provide flexibility to users while avoiding having to write code to normalize the user input into a simplified programmer oriented format.

The parser handles mapping from basic types into higher forms for you. These days, we have high level types like FileSystemMode (either an octal or a symbolic mode), different kinds of file system matches depending on whether globs should be performed, etc. These types include their own validation and parsing rules that debputy handles for you.

Introspection and support for providing online reference documentation. Also, debputy checks that the provided attribute documentation covers all the attributes in the manifest form. If you add a new attribute, debputy will remind you if you forget to document it as well. :)

In this way everybody wins. Yes, writing this parser generator code was more enjoyable than writing the ad-hoc manual parsers it replaced. :)

19 January 2024

Russell Coker: 2.5Gbit Ethernet

I just decided to upgrade the core of my home network from 1Gbit to 2.5Gbit. I didn't really need to do this, it was only about 5 years ago that I upgraded from 100Mbit to 1Gbit, but it's cheap and seemed interesting. I decided to do it because a 2.5Gbit switch was listed as cheap on Ozbargain Computing [1], that was $40.94 delivered. If you are in Australia and like computers then Ozbargain is a site worth polling, every day there are interesting things at low prices.

The seller of the switch is KeeplinkStore [2] who distinguished themselves by phoning me from China to inform me that I had ordered a switch with a UK plug for delivery to Australia and suggesting that I cancel the order and make a new order with an Australian plug. It wouldn't have been a big deal if I had received a UK plug as I've got a collection of adaptors but it was still nice of them to make it convenient for me. The switch basically does what it's expected to do and has no fan so it's quiet.

I got a single port 2.5Gbit PCIe card for $18.77 and a dual port card for $34.07. Those cards are a little expensive when compared to 1Gbit cards but very cheap when compared to the computers they are installed in. These cards use the Realtek RTL8125 chipset and work well.

I got a USB-3 2.5Gbit device for $17.43. I deliberately didn't get USB-C because I still use laptops without USB-C and most of the laptops with USB-C only have a single USB-C port which is used for power. I don't plan to stop using my 100Mbit USB ethernet device because most of the time I don't need a lot of speed. But sometimes I do things like testing auto-install on laptops and then having something faster than Gigabit is good.
This card worked at 1Gbit speed on a 1Gbit network when used with a system running Debian/Bookworm with kernel 6.1 and worked at 2.5Gbit speed when connected to my LicheePi RISC-V system running Linux 5.10, but it would only do 100Mbit on my laptop running Debian/Unstable with kernel 6.6 (Debian Bug #1061095) [3]. It's a little disappointing but not many people have such hardware so it probably doesn't get a lot of testing. For the moment I plan to just use a 1Gbit USB Ethernet device most of the time and if I really need the speed I'll just use an older kernel.

I did some tests with wget and curl to see if I could get decent speeds. When using wget 1.21.3 on Debian/Bookworm I got transfer speeds of 103MB/s and 18.8s of system CPU time out of 23.6s of elapsed time. Curl on Debian/Bookworm did 203MB/s and took 10.7s of system CPU time out of 11.8s elapsed time. The difference is that curl was using 100KB read buffers and a mix of 12K and 4K write buffers while wget was using 8KB read buffers and 4KB write buffers. On Debian/Unstable wget 1.21.4 uses 64K read buffers and a mix of 4K and 60K write buffers and gets a speed of 208MB/s. As an experiment I changed the read buffer size for wget to 256K and that got the speed up to around 220MB/s but it was difficult to measure as the occasional packet loss slowed things down. The pattern of writing 4K and then writing the rest continued, it seemed related to fwrite() buffering. For anyone else who wants to experiment with the code, the wget code is simpler (due to fewer features) and the package builds a lot faster (due to fewer tests) so that's the one to work on.

The client machine for these tests has an E5-2696 v3 CPU, this doesn't compare well to some of the recent AMD CPUs on single-core performance but is still a decently powerful system. Getting good performance at Gigabit speeds on an ARM or RISC-V system is probably going to be a lot harder than getting good performance at 2.5Gbit speeds on this system.
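To get a feel for why the read buffer size matters, here is a minimal sketch that counts how many read calls it takes to move the same payload with different buffer sizes. It is not taken from the wget or curl source: the copy_stream helper is hypothetical and in-memory streams stand in for sockets and files, so it only shows the call count, not real throughput.

```python
import io
from typing import BinaryIO, Tuple


def copy_stream(src: BinaryIO, dst: BinaryIO, read_size: int) -> Tuple[int, int]:
    """Copy src to dst in read_size chunks.

    Returns (bytes copied, number of read calls). In a real network
    client each read call is roughly one system call.
    """
    total = 0
    calls = 0
    while True:
        chunk = src.read(read_size)
        calls += 1
        if not chunk:  # an empty read marks the end of the stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total, calls


payload = bytes(1024 * 1024)  # 1 MiB of zeros as stand-in data
# 1 MiB needs 128 data reads at 8 KiB but only 4 at 256 KiB
# (each run also makes one final empty read).
for size in (8 * 1024, 64 * 1024, 256 * 1024):
    copied, calls = copy_stream(io.BytesIO(payload), io.BytesIO(), size)
    print(f"{size // 1024:>3} KiB buffer: {calls} read calls for {copied} bytes")
```

Fewer, larger reads mean fewer trips into the kernel for the same data, which would fit the lower system CPU time seen with the larger buffers.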
In conclusion, 2.5Gbit basically works apart from a problem with new kernels and a problem with the old version of wget. I expect that when Debian/Trixie is released (probably mid 2025) things will work well. For good transfer rates use wget version 1.21.4 or newer or use curl. As an aside I use a 1500 byte MTU because I have some 100baseT systems on my LAN, and the settings regarding TCP acceleration etc are all the defaults.

18 January 2024

Russell Coker: LicheePi 4A (RISC-V) First Look

I just bought a LicheePi 4A RISC-V embedded computer (like a RaspberryPi but with a RISC-V CPU) for $322.68 from Aliexpress (the official site for buying LicheePi devices). Here is the Sipeed web page about it and their other recent offerings [1]. I got the version with 16G of RAM and 128G of storage, I probably don't need that much storage (I can use NFS or USB) but 16G of RAM is good for VMs. Here is the Wiki about this board [2].

Configuration

When you get one of these devices you should make setting up an ssh server your first priority. I found the HDMI output to be very unreliable. The first monitor I tried was a Samsung 4K monitor dating from when 4K was a new thing, the LicheePi initially refused to operate at a resolution higher than 1024*768 but later on switched to 4K resolution when resuming from screen-blank for no apparent reason (and the window manager didn't support this properly). On the Dell 4K monitor I use on my main workstation it sometimes refused to talk to it and occasionally worked. I got it running at 1920*1080 without problems and then switched it to 4K and it lost video sync and never talked to that monitor again. On my Desklab portable 4K monitor I got it to display in 4K resolution but only the top left 1/4 of the screen displayed.

The issues with HDMI monitor support greatly limit the immediate potential for using this as a workstation. It doesn't make it impossible but would be fiddly at best. It's quite likely that a future OS update will fix this. But at the moment it's best used as a server.

The LicheePi has a custom Linux distribution based on Ubuntu so you want to put something like the following in /etc/network/interfaces to make it automatically connect to the ethernet when plugged in:

auto end0
iface end0 inet dhcp

Then to get sshd to start you have to run the following commands to generate ssh host keys that aren't zero bytes long:

rm /etc/ssh/ssh_host_*
systemctl restart ssh.service

It appears to have wifi hardware but the OS doesn't recognise it. This isn't a priority for me as I mostly want to use it as a server.

Performance

For the first test of performance I created a 100MB file from /dev/urandom and then tried compressing it on various systems. With zstd -9 it took 16.893 user seconds on the LicheePi4A, 0.428s on my Thinkpad X1 Carbon Gen5 with an i5-6300U CPU (Debian/Unstable), 1.288s on my E5-2696 v3 workstation (Debian/Bookworm), 0.467s on the E5-2696 v3 running Debian/Unstable, 2.067s on an E3-1271 v3 server, and 7.179s on the E3-1271 v3 system emulating a RISC-V system via QEMU running Debian/Unstable. It's very impressive that the QEMU emulation is fast enough that emulating a different CPU architecture is only 3.5* slower for this test (or maybe 10* slower if it was running Debian/Unstable on the AMD64 code)! The emulated RISC-V is also more than twice as fast as real RISC-V hardware and probably of comparable speed to real RISC-V hardware when running the same versions (and might be slightly slower if running the same version of zstd) which is a tribute to the quality of emulation.

One performance issue that most people don't notice is the time taken to negotiate ssh sessions. It's usually not noticed because the common CPUs have got faster at about the same rate as the algorithms for encryption and authentication have become more complex. On my i5-6300U laptop it takes 0m0.384s to run ssh -i ~/.ssh/id_ed25519 localhost id with the below server settings (taken from advice on ssh-audit.com [3] for a secure ssh configuration). On the E3-1271 v3 server it is 0.336s, on the QEMU system it is 28.022s, and on the LicheePi it is 0.592s. By this metric the LicheePi is about 80% slower than decent x86 systems and the QEMU emulation of RISC-V is 73* slower than the x86 system it runs on. Does crypto depend on instructions that are difficult to emulate?

HostKey /etc/ssh/ssh_host_ed25519_key
KexAlgorithms -ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group14-sha256
MACs -umac-64-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
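The zstd slowdown factors quoted above can be rechecked from the reported user times. This is only a small arithmetic sketch: the dictionary keys and the slowdown helper are made-up labels, and the numbers are the ones given in the text.

```python
# zstd -9 user times in seconds, as reported in the text above.
zstd_user_seconds = {
    "licheepi4a": 16.893,
    "e3_1271_native": 2.067,
    "e3_1271_qemu_riscv": 7.179,
}


def slowdown(slow: float, fast: float) -> float:
    """How many times longer the slow run took than the fast run."""
    return slow / fast


# Emulated RISC-V versus native code on the same E3-1271 v3 host.
emulation_cost = slowdown(
    zstd_user_seconds["e3_1271_qemu_riscv"],
    zstd_user_seconds["e3_1271_native"],
)
# Real RISC-V hardware versus the emulated RISC-V system.
hardware_vs_emulated = slowdown(
    zstd_user_seconds["licheepi4a"],
    zstd_user_seconds["e3_1271_qemu_riscv"],
)

print(f"QEMU vs native host: {emulation_cost:.2f}x")
print(f"LicheePi vs QEMU:    {hardware_vs_emulated:.2f}x")
```

The first ratio comes out at roughly 3.5 and the second at a little over 2, matching the "3.5* slower" and "more than twice as fast" figures.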

I haven't yet tested the performance of Ethernet (what routing speed can you get through the 2 gigabit ports?), emmc storage, and USB. At the moment I've been focused on using RISC-V as a test and development platform. My conclusion is that I'm glad I don't plan to compile many kernels or anything large like LibreOffice. But for the typical development that I do it will be quite adequate. The speed of Chromium seems adequate in basic tests, but the video output hasn't worked reliably enough to do advanced tests.

Hardware Features

Having two Gigabit Ethernet ports, 4 USB-3 ports, and Wifi on board gives some great options for using this as a router. It's disappointing that they didn't go with 2.5Gbit as everyone seems to be doing that nowadays but Gigabit is enough for most things. Having only a single HDMI port and not supporting USB-C docks (the USB-C port appears to be power only) limits what can be done for workstation use and for controlling displays. I know of people using small ARM computers attached to the back of large TVs for advertising purposes and that isn't going to be a great option for this.

The CPU and RAM apparently use a lot of power (which is relative, the entire system draws up to 2A at 5V so the CPU would be something below 5W). To get this working a cooling fan has to be stuck to the CPU and RAM chips via a layer of thermal stuff that resembles a fine sheet of blu-tack in both color and stickiness. I am disappointed that there isn't any more solid form of construction, to mount this on a wall or ceiling some extra hardware would be needed to secure this. Also if they just had a really big copper heatsink I think that would be better. 80386 CPUs with similar TDP were able to run without a fan.

I wonder how things would work with all USB ports in use. It's expected that a USB port can supply a minimum of 2.5W which means that all the ports could require 10W if they were active.
Presumably something significantly less than 5W is available for the USB ports.

Other Devices

Sipeed has a range of other devices in the works. They currently sell the LicheeCluster4A which supports 7 compute modules for a cluster in a box. This has some interesting potential for testing and demonstrating cluster software but you could probably buy an AMD64 system with more compute power for less money. The Lichee Console 4A is a tiny laptop which could be useful for people who like the 7" laptop form factor. Unfortunately it only has a 1280*800 display, if it had the same resolution display as a typical 7" phone I would have bought one. The next device that appeals to me is the soon to be released Lichee Pad 4A which is a 10.1" tablet with 1920*1200 display, Wifi6, Bluetooth 5.4, and 16G of RAM. It also has 1 USB-C connection, 2*USB-3 sockets, and support for an external card with 2*Gigabit ethernet. It's a tablet designed as a laptop without a keyboard rather than the more common larger phone design model. They are also about to release the LicheePadMax4A which is similar to the other tablet but with a 14" 2240*1400 display and which ships with a keyboard to make it essentially a laptop with detachable keyboard.

Conclusion

At this time I wouldn't recommend that this device be used as a workstation or laptop, although the people who want to do such things will probably do it anyway regardless of my recommendations. I think it will be very useful as a test system for RISC-V development. I have some friends who are interested in this sort of thing and I can give them VMs. It is a bit expensive. The Sipeed web site boasts about the LicheePi4 being faster than the RaspberryPi4, but it's not a lot faster and the RaspberryPi4 is much cheaper ($127 or $129 for one with 8G of RAM). The RaspberryPi4 has two HDMI ports but a limit of 8G of RAM while the LicheePi has up to 16G of RAM and two Gigabit Ethernet ports but only a single HDMI port.
It seems that the RaspberryPi4 might win if you want a cheap low power desktop system. At this time I think the reason for this device is testing out RISC-V as an alternative to the AMD64 and ARM64 architectures. An open CPU architecture goes well with free software, but it isn't just people who are into FOSS who are testing such things. I know some corporations are trying out RISC-V as a way of getting other options for embedded systems that don't involve paying monopolists.

The Lichee Console 4A is probably a usable tiny laptop if the resolution is sufficient for your needs. As an aside I predict that the tiny laptop or pocket computer segment will take off in the near future. There are some AMD64 systems the size of a phone but thicker that run Windows and go for reasonable prices on AliExpress.

Hopefully in the near future this device will have better video drivers and be usable as a small and quiet workstation. I won't rule out the possibility of making this my main workstation in the not too distant future, all it needs is reliable 4K display and the ability to decode 4K video. Its performance for web browsing and as an ssh client seems adequate, and that's what matters for my workstation use. But for the moment it's just for server use.

17 January 2024

Colin Watson: Task management

Now that I'm freelancing, I need to actually track my time, which is something I've had the luxury of not having to do before. That meant something of a rethink of the way I've been keeping track of my to-do list. Up to now that was a combination of things like the bug lists for the projects I'm working on at the moment, whatever task tracking system Canonical was using at the moment (Jira when I left), and a giant flat text file in which I recorded logbook-style notes of what I'd done each day plus a few extra notes at the bottom to remind myself of particularly urgent tasks. I could have started manually adding times to each logbook entry, but ugh, let's not. In general, I had the following goals (which were a bit reminiscent of my address book):

free software throughout
storage under my control
ability to annotate tasks with URLs (especially bugs and merge requests)
lightweight time tracking (I'm OK with having to explicitly tell it when I start and stop tasks)
ability to drive everything from the command line
decent filtering so I don't have to look at my entire to-do list all the time
ability to easily generate billing information for multiple clients
optionally, integration with Android (mainly so I can tick off personal tasks like "change bedroom lightbulb" or whatever that don't involve being near a computer)

I didn't do an elaborate evaluation of multiple options, because I'm not trying to come up with the best possible solution for a client here. Also, there are a bazillion to-do list trackers out there and if I tried to evaluate them all I'd never do anything else. I just wanted something that works well enough for me.

Since it came up on Mastodon: a bunch of people swear by Org mode, which I know can do at least some of this sort of thing. However, I don't use Emacs and don't plan to use Emacs. nvim-orgmode does have some support for time tracking, but when I've tried vim-based versions of Org mode in the past I've found they haven't really fitted my brain very well.

Taskwarrior and Timewarrior

One of the other Freexian collaborators mentioned Taskwarrior and Timewarrior, so I had a look at those. The basic idea of Taskwarrior is that you have a task command that tracks each task as a blob of JSON and provides subcommands to let you add, modify, and remove tasks with a minimum of friction. task add adds a task, and you can add metadata like project:Personal (I always make sure every task has a project, for ease of filtering). Just running task shows you a task list sorted by Taskwarrior's idea of urgency, with an ID for each task, and there are various other reports with different filtering and verbosity. task <id> annotate lets you attach more information to a task.

task <id> done marks it as done. So far so good, so a redacted version of my to-do list looks like this:

$ task ls
ID A Project     Tags                 Description
17   Freexian                         Add Incus support to autopkgtest [2]
 7   Columbiform                      Figure out Lloyds online banking [1]
 2   Debian                           Fix troffcvt for groff 1.23.0 [1]
11   Personal                         Replace living room curtain rail

Once I got comfortable with it, this was already a big improvement. I haven't bothered to learn all the filtering gadgets yet, but it was easy enough to see that I could do something like task all project:Personal and it'd show me both pending and completed tasks in that project, and that all the data was stored in ~/.task - though I have to say that there are enough reporting bells and whistles that I haven't needed to poke around manually. In combination with the regular backups that I do anyway (you do too, right?), this gave me enough confidence to abandon my previous text-file logbook approach.

Next was time tracking. Timewarrior integrates with Taskwarrior, albeit in an only semi-packaged way, and it was easy enough to set that up. Now I can do:

$ task 25 start
Starting task 00a9516f 'Write blog post about task tracking'.
Started 1 task.
Note: '"Write blog post about task tracking"' is a new tag.
Tracking Columbiform "Write blog post about task tracking"
  Started 2024-01-10T11:28:38
  Current                  38
  Total               0:00:00
You have more urgent tasks.
Project 'Columbiform' is 25% complete (3 of 4 tasks remaining).
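Each tracked interval carries its tags (here the client name and the task description), which is what makes per-client totals possible. Timewarrior can report these itself, so the following is only a sketch of the idea. It assumes the JSON shape produced by timew export (UTC start and end timestamps plus a tags list); the hours_per_tag helper is hypothetical and the sample intervals are invented.

```python
import json
from collections import defaultdict
from datetime import datetime
from typing import Dict

# Assumed timestamp layout of "timew export", e.g. 20240110T112838Z.
TIMEW_FORMAT = "%Y%m%dT%H%M%SZ"


def hours_per_tag(export_json: str) -> Dict[str, float]:
    """Sum the tracked hours for every tag in timew-export-style JSON."""
    totals: Dict[str, float] = defaultdict(float)
    for interval in json.loads(export_json):
        if "end" not in interval:
            continue  # interval is still being tracked, skip it
        start = datetime.strptime(interval["start"], TIMEW_FORMAT)
        end = datetime.strptime(interval["end"], TIMEW_FORMAT)
        hours = (end - start).total_seconds() / 3600
        for tag in interval.get("tags", []):
            totals[tag] += hours
    return dict(totals)


# Invented sample data: one finished interval and one still running.
sample = """[
  {"id": 2, "start": "20240110T090000Z", "end": "20240110T103000Z",
   "tags": ["Columbiform", "Write blog post about task tracking"]},
  {"id": 1, "start": "20240110T112838Z", "tags": ["Columbiform"]}
]"""

for tag, hours in hours_per_tag(sample).items():
    print(f"{tag}: {hours:.2f}h")
```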

When I stop work on something, I do task active to find the ID, then task <id> stop. Timewarrior does the tedious stopwatch business for me, and I can manually enter times if I forget to start/stop a task. Then the really useful bit: I can do something like timew summary :month <name-of-client> and it tells me how much to bill that client for this month. Perfect.

I also started using VIT to simplify the day-to-day flow a little, which means I'm normally just using one or two keystrokes rather than typing longer commands. That isn't really necessary from my point of view, but it does save some time.

Android integration

I left Android integration for a bit later since it wasn't essential. When I got round to it, I have to say that it felt a bit clumsy, but it did eventually work. The first step was to set up a taskserver. Most of the setup procedure was OK, but I wanted to use Let's Encrypt to minimize the amount of messing around with CAs I had to do. Getting this to work involved hitting things with sticks a bit, and there's still a local CA involved for client certificates. What I ended up with was a certbot setup with the webroot authenticator and a custom deploy hook as follows (with cert_name replaced by a DNS name in my house domain):

#! /bin/sh
set -eu
cert_name=taskd.example.org
found=false
for domain in $RENEWED_DOMAINS; do
    case "$domain" in
        $cert_name)
            found=:
            ;;
    esac
done
$found || exit 0
install -m 644 "/etc/letsencrypt/live/$cert_name/fullchain.pem" \
    /var/lib/taskd/pki/fullchain.pem
install -m 640 -g Debian-taskd "/etc/letsencrypt/live/$cert_name/privkey.pem" \
    /var/lib/taskd/pki/privkey.pem
systemctl restart taskd.service

I could then set this in /etc/taskd/config (server.crl.pem and ca.cert.pem were generated using the documented taskserver setup procedure):

server.key=/var/lib/taskd/pki/privkey.pem
server.cert=/var/lib/taskd/pki/fullchain.pem
server.crl=/var/lib/taskd/pki/server.crl.pem
ca.cert=/var/lib/taskd/pki/ca.cert.pem

Then I could set taskd.ca on my laptop to /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt and otherwise follow the client setup instructions, run task sync init to get things started, and then task sync every so often to sync changes between my laptop and the taskserver.

I used TaskWarrior Mobile as the client. I have to say I wouldn't want to use that client as my primary task tracking interface: the setup procedure is clunky even beyond the necessity of copying a client certificate around, it expects you to give it a .taskrc rather than having a proper settings interface for that, and it only seems to let you add a task if you specify a due date for it. It also lacks Timewarrior integration, so I can only really use it when I don't care about time tracking, e.g. personal tasks. But that's really all I need, so it meets my minimum requirements.

Next?

Considering this is literally the first thing I tried, I have to say I'm pretty happy with it. There are a bunch of optional extras I haven't tried yet, but in general it kind of has the vim nature for me: if I need something it's very likely to exist or easy enough to build, but the features I don't use don't get in my way. I wouldn't recommend any of this to somebody who didn't already spend most of their time in a terminal - but I do. I'm glad people have gone to all the effort to build this so I didn't have to.

16 January 2024

Russ Allbery: Review: Making Money

Review: Making Money, by Terry Pratchett

Series:	Discworld #36
Publisher:	Harper
Copyright:	October 2007
Printing:	November 2014
ISBN:	0-06-233499-9
Format:	Mass market
Pages:	473

Making Money is the 36th Discworld novel, the second Moist von Lipwig book, and a direct sequel to Going Postal. You could start the series with Going Postal, but I would not start here. The post office is running like a well-oiled machine, Adora Belle is out of town, and Moist von Lipwig is getting bored. It's the sort of boredom that has him picking his own locks, taking up Extreme Sneezing, and climbing buildings at night. He may not realize it, but he needs something more dangerous to do. Vetinari has just the thing. The Royal Bank of Ankh-Morpork, unlike the post office before Moist got to it, is still working. It is a stolid, boring institution doing stolid, boring things for rich people. It is also the battleground for the Lavish family past-time: suing each other and fighting over money. The Lavishes are old money, the kind of money carefully entangled in trusts and investments designed to ensure the family will always have money regardless of how stupid their children are. Control of the bank is temporarily in the grasp of Joshua Lavish's widow Topsy, who is not a true Lavish, but the vultures are circling. Meanwhile, Vetinari has grand city infrastructure plans, and to carry them out he needs financing. That means he needs a functional bank, and preferably one that is much less conservative. Moist is dubious about running a bank, and even more reluctant when Topsy Lavish sees him for exactly the con artist he is. His hand is forced when she dies, and Moist discovers he has inherited her dog, Mr. Fusspot. A dog that now owns 51% of the Royal Bank and therefore is the chairman of the bank's board of directors. A dog whose safety is tied to Moist's own by way of an expensive assassination contract. Pratchett knew he had a good story with Going Postal, so here he runs the same formula again. And yes, I was happy to read it again. 
Moist knows very little about banking but quite a lot about pretending something will work until it does, which has more to do with banking than it does with running a post office. The bank employs an expert, Mr. Bent, who is fanatically devoted to the gold standard and the correctness of the books and has very little patience for Moist. There are golem-related hijinks. The best part of this book is Vetinari, who is masterfully manipulating everyone in the story and who gets in some great lines about politics.

"We are not going to have another wretched empire while I am Patrician. We've only just got over the last one."

Also, Vetinari processing dead letters in the post office was an absolute delight. Making Money does have the recurring Pratchett problem of having a fairly thin plot surrounded by random... stuff. Moist's attempts to reform the city currency while staying ahead of the Lavishes is only vaguely related to Mr. Bent's plot arc. The golems are unrelated to the rest of the plot other than providing a convenient deus ex machina. There is an economist making water models in the bank basement with an Igor, which is a great gag but has essentially nothing to do with the rest of the book. One of the golems has been subjected to well-meaning older ladies and 1950s etiquette manuals, which I thought was considerably less funny (and somewhat creepier) than Pratchett did. There are (sigh) clowns, which continue to be my least favorite Ankh-Morpork world-building element. At least the dog was considerably less annoying than I was afraid it was going to be. This grab-bag randomness is a shame, since I think there was room here for a more substantial plot that engaged fully with the high weirdness of finance. Unfortunately, this was a bit like the post office in Going Postal: Pratchett dives into the subject just enough to make a few wry observations and a few funny quips, and then resolves the deeper issues off-camera. Moist tries to invent fiat currency, because of course he does, and Pratchett almost takes on the gold standard, only to veer away at the last minute into vigorous hand-waving. I suspect part of the problem is that I know a little bit too much about finance, so I kept expecting Pratchett to take the humorous social commentary a couple of levels deeper. On a similar note, the villains have great potential that Pratchett undermines by adding too much over-the-top weirdness. 
I wish Cosmo Lavish had been closer to what he appears to be at the start of the book: a very wealthy and vindictive man (and a reference to Cosimo de Medici) who doesn't have Moist's ability to come up with wildly risky gambits but who knows considerably more than he does about how banking works. Instead, Pratchett gives him a weird obsession that slowly makes him less sinister and more pathetic, which robs the book of a competent antagonist for Moist. The net result is still a fun book, and a solid Discworld entry, but it lacks the core of the best series entries. It felt more like a skit comedy show than a novel, but it's an excellent skit comedy show with the normal assortment of memorable Pratchettisms. Certainly if you've read this far, or even if you've only read Going Postal, you'll want to read Making Money as well. Followed by Unseen Academicals. The next Moist von Lipwig book is Raising Steam. Rating: 8 out of 10

15 January 2024

Russ Allbery: Review: The Library of Broken Worlds

Review: The Library of Broken Worlds, by Alaya Dawn Johnson

Publisher:	Scholastic Press
Copyright:	June 2023
ISBN:	1-338-29064-9
Format:	Kindle
Pages:	446

The Library of Broken Worlds is a young-adult far-future science fantasy. So far as I can tell, it's stand-alone, although more on that later in the review. Freida is the adopted daughter of Nadi, the Head Librarian, and her greatest wish is to become a librarian herself. When the book opens, she's a teenager in highly competitive training. Freida is low-wetware, without the advanced and expensive enhancements of many of the other students competing for rare and prized librarian positions, which she makes up for by being the most audacious. She doesn't need wetware to commune with the library material gods. If one ventures deep into their tunnels and consumes their crystals, direct physical communion is possible. The library tunnels are Freida's second home, in part because that's where she was born. She was created by the Library, and specifically by Iemaja, the youngest of the material gods. Precisely why is a mystery. To Nadi, Freida is her daughter. To Quinn, Nadi's main political rival within the library, Freida is a thing, a piece of the library, a secondary and possibly rogue AI. A disruptive annoyance. The Library of Broken Worlds is the sort of science fiction where figuring out what is going on is an integral part of the reading experience. It opens with a frame story of an unnamed girl (clearly Freida) waking the god Nameren and identifying herself as designed for deicide. She provokes Nameren's curiosity and offers an Arabian Nights bargain: if he wants to hear her story, he has to refrain from killing her for long enough for her to tell it. As one might expect, the main narrative doesn't catch up to the frame story until the very end of the book. The Library is indeed some type of library that librarians can search for knowledge that isn't available from more mundane sources, but Freida's personal experience of it is almost wholly religious and oracular. 
The library's material gods are identified as AIs, but good luck making sense of the story through a science fiction frame, even with a healthy allowance for sufficiently advanced technology being indistinguishable from magic. The symbolism and tone is entirely fantasy, and late in the book it becomes clear that whatever the material gods are, they're not simple technological AIs in the vein of, say, Banks's Ship Minds. Also, the Library is not solely a repository of knowledge. It is the keeper of an interstellar peace. The Library was founded after the Great War, to prevent a recurrence. It functions as a sort of legal system and grand tribunal in ways that are never fully explained. As you might expect, that peace is based more on stability than fairness. Five of the players in this far future of humanity are the Awilu, the most advanced society and the first to leave Earth (or Tierra as it's called here); the Mah m, who possess the material war god Nameren of the frame story; the Lunars and Martians, who dominate the Sol system; and the surviving Tierrans, residents of a polluted and struggling planet that is ruthlessly exploited by the Lunars. The problem facing Freida and her friends at the start of the book is a petition brought by a young Tierran against Lunar exploitation of his homeland. His name is Joshua, and Freida is more than half in love with him. Joshua's legal argument involves interpretation of the freedom node of the treaty that ended the Great War, a node that precedent says gives the Lunars the freedom to exploit Tierra, but which Joshua claims has a still-valid originalist meaning granting Tierrans freedom from exploitation. There is, in short, a lot going on in this book, and "never fully explained" is something of a theme. Freida is telling a story to Nameren and only explains things Nameren may not already know. The reader has to puzzle out the rest from the occasional hint. 
This is made more difficult by the tendency of the material gods to communicate only in visions or guided hallucinations, full of symbolism that the characters only partly explain to the reader. Nonetheless, this did mostly work, at least for me. I started this book very confused, but by about the midpoint it felt like the background was coming together. I'm still not sure I understand the aurochs, baobab, and cicada symbolism that's so central to the framing story, but it's the pleasant sort of stretchy confusion that gives my brain a good workout. I wish Johnson had explained a few more things plainly, particularly near the end of the book, but my remaining level of confusion was within my tolerances. Unfortunately, the ending did not work for me. The first time I read it, I had no idea what it meant. Lots of baffling, symbolic things happened and then the book just stopped. After re-reading the last 10%, I think all the pieces of an ending and a bit of an explanation are there, but it's absurdly abbreviated. This is another book where the author appears to have been finished with the story before I was. This keeps happening to me, so this probably says something more about me than it says about books, but I want books to have an ending. If the characters have fought and suffered through the plot, I want them to have some space to be happy and to see how their sacrifices play out, with more detail than just a few vague promises. If much of the book has been puzzling out the nature of the world, I would like some concrete confirmation of at least some of my guesswork. And if you're going to end the book on radical transformation, I want to see the results of that transformation. Johnson does an excellent job showing how brutal the peace of the powerful can be, and is willing to light more things on fire over the course of this book than most authors would, but then doesn't offer the reader much in the way of payoff. 
For once, I wish this stand-alone turned out to be a series. I think an additional book could be written in the aftermath of this ending, and I would definitely read that novel. Johnson has me caring deeply about these characters and fascinated by the world background, and I'd happily spend another 450 pages finding out what happens next. But, frustratingly, I think this ending was indeed intended to wrap up the story. I think this book may fall between a few stools. Science fiction readers who want mysterious future worlds to be explained by the end of the book are going to be frustrated by the amount of symbolism, allusion, and poetic description. Literary fantasy readers, who have a higher tolerance for that style, are going to wish for more focused and polished writing. A lot of the story is firmly YA: trying and failing to fit in, developing one's identity, coming into power, relationship drama, great betrayals and regrets, overcoming trauma and abuse, and unraveling lies that adults tell you. But this is definitely not a straight-forward YA plot or world background. It demands a lot from the reader, and while I am confident many teenage readers would rise to that challenge, it seems like an awkward fit for the YA marketing category. About 75% of the way in, I would have told you this book was great and you should read it. The ending was a let-down and I'm still grumpy about it. I still think it's worth your attention if you're in the mood for a sink-or-swim type of reading experience. Just be warned that when the ride ends, I felt unceremoniously dumped on the pavement. Content warnings: Rape, torture, genocide. Rating: 7 out of 10

14 January 2024

Debian Brasil: MiniDebConf BH 2024 - registration and call for activities open

Registration for participants and the call for activities are now open for MiniDebConf Belo Horizonte 2024 and for FLISOL, the Latin American Free Software Installation Festival (Festival Latino-americano de Instalação de Software Livre). Some important information follows.

Date and location of MiniDebConf and FLISOL

MiniDebConf will take place from April 27 to 30 at the Pampulha Campus of UFMG - Universidade Federal de Minas Gerais. On the 27th (Saturday) we will also hold an edition of FLISOL, an event that takes place on the same day in several cities across Latin America. While MiniDebConf will have activities focused on Debian, FLISOL will have general activities about Free Software and related topics such as programming languages, CMS, network and system administration, philosophy, freedom, licenses, etc.

Free registration and bursaries

You can already register for free for MiniDebConf Belo Horizonte 2024. MiniDebConf is an event open to everyone, regardless of their level of knowledge about Debian. The most important thing is to bring the community together to celebrate one of the largest Free Software projects in the world, so we want to welcome everyone from inexperienced users who are just starting out with Debian to official Developers of the project. In other words, everyone is invited!

This year we are offering accommodation and travel bursaries to make it possible for people from other cities who contribute to the Debian Project to attend. Non-official contributors, DMs and DDs can request bursaries using the registration form. We are also offering food bursaries for all participants, including non-contributors and people who live in the BH region. Financial resources are quite limited, but we will try to fulfil as many requests as possible.

If you intend to request any of these bursaries, visit this link and read the additional information before registering. Registration (without bursaries) can be done up to the date of the event, but there is a deadline for requesting accommodation and travel bursaries, so pay attention to the final date: February 18.

Since we are using the same form for both events, registration is valid for both MiniDebConf and FLISOL. To register, go to the website and choose Create account ("Criar conta"). Create your account (preferably using Salsa) and open your profile. There you will see the Register ("Se inscrever") button. https://bh.mini.debconf.org

Call for activities

The call for activities is also open for both MiniDebConf and FLISOL. For more information, visit this link. Pay attention to the deadline for submitting your activity proposal: February 18.

Contact

If you have any questions, send an email to contato@debianbrasil.org.br

Organization

13 January 2024

Freexian Collaborators: Debian Contributions: LXD/Incus backend bug, /usr-merge updates, gcc-for-host, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

LXD/Incus backend bug in autopkgtest, by Stefano Rivera

While working on the Python 3.12 transition, Stefano repeatedly ran into a bug in autopkgtest when using LXD (or in the future Incus), that caused it to hang when running cython's multi-hour autopkgtests. After some head-banging, the bug turned out to be fairly straightforward: LXD didn't shut down on receiving a SIGTERM, so when a testsuite timed out, it would hang forever. A simple fix has been applied.
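The failure mode described above (a child process that ignores SIGTERM, leaving its parent waiting forever after a timeout) can be illustrated with a generic sketch. This is not the fix that was applied to autopkgtest, whose details are not given here; it only shows the common "ask politely, then force" pattern on a POSIX system. The function name `run_with_timeout`, the `grace` parameter, and the `STUBBORN` command are hypothetical names introduced for illustration.

```python
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout, grace=5.0):
    """Run cmd and return its exit status.

    If cmd exceeds `timeout` seconds, send SIGTERM; if it is still alive
    after `grace` more seconds, send SIGKILL, which cannot be ignored.
    A negative return value means the child was ended by that signal.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()  # polite request: SIGTERM
        try:
            return proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGTERM was ignored, escalate to SIGKILL
            return proc.wait()


# Hypothetical stand-in for a backend that ignores SIGTERM.
STUBBORN = [
    sys.executable,
    "-c",
    "import signal, time; "
    "signal.signal(signal.SIGTERM, signal.SIG_IGN); time.sleep(60)",
]

if __name__ == "__main__":
    status = run_with_timeout(STUBBORN, timeout=2.0, grace=1.0)
    print("child exit status:", status)
```

Without the second `wait` and the `kill()` fallback, a caller that only sends SIGTERM and then waits would block for as long as the child keeps running, which matches the hang described in the report.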

/usr-merge, by Helmut Grohne

Thanks to Christian Hofstaedtler and others, /usr-merge is turning into a community effort, and the work funded by Freexian is becoming more difficult to separate from non-funded work. In particular, since the community fully handled all issues around lost `udev` rules, `dh_installudev` now installs rules to `/usr`. The story around diversions took another detour. We learned that conflicts do not reliably prevent concurrent unpack, and the reiterated mitigation for `molly-guard` triggered this. After a bit of back and forth and consultation with the developer mailing list, we concluded that avoiding the problematic behavior when using `apt` or an `apt`-based upgrader combined with a loss mitigation would be good enough. The involved packages `bfh-container`, `molly-guard`, `progress-linux-container` and `systemd` have since been uploaded to `unstable` and the matter seems finally solved, except that it doesn't quite work with `sysvinit` yet. The same approach is now being proposed for the diversions of zutils for gzip. We thank the involved maintainers for their timely cooperation.

gcc-for-host, by Helmut Grohne

Since forever, it has been difficult to correctly express a toolchain build dependency. This can be seen in the `Build-Depends` of the `linux` source package for instance. While this has been solved for `binutils` a while back, the patches for `gcc` have been unfinished. With lots of constructive feedback from `gcc` package maintainer Matthias Klose, Helmut worked on finalizing and testing these patches. Patch stacks are now available for gcc-13 and gcc-14 and Matthias already included parts of them in test builds for Ubuntu `noble`. Finishing this work would enable us to resolve around 1000 cross build dependency satisfiability issues in unstable.

Miscellaneous contributions

Stefano continued work on the Python 3.12 transition, including uploads of cython, pycxx, numpy, python-greenlet, twisted, foolscap and dh-python.

Stefano reviewed and selected from a new round of DebConf 24 bids, as part of the DebConf Committee. Busan, South Korea was selected.

For debian-printing Thorsten uploaded hplip to unstable to fix a /usr-merge bug and cups to Bookworm to fix bugs related to printing in color.

Utkarsh helped newcomers by mentoring them and reviewing their packaging, e.g. golang-github-prometheus-community-pgbouncer-exporter.

Helmut sent patches for 42 cross build failures unrelated to the `gcc-for-host` work.

Helmut continues to maintain `rebootstrap`. In December, `blt` started depending on `libjpeg` and this poses a dependency loop. Ideally, Python would stop depending on `blt`. Also `linux-libc-dev` having become `Multi-Arch: foreign` poses non-trivial issues that are not fully resolved yet.

Enrico participated in /usr-merge discussions with Helmut.
